Commit de1a29d

misunderstood how activation functions were applied
1 parent dfda498 commit de1a29d

2 files changed: 8 additions, 7 deletions

gateloop_transformer/gateloop_transformer.py

Lines changed: 7 additions & 6 deletions
@@ -139,15 +139,16 @@ def binary_operator(a, b):
         a_i, kv_i = a
         a_j, kv_j = b

-        # unsure, but i think this is what the paper was doing
-        # feel free to open an issue if not
-
-        a_i = a_i.real.sigmoid() + 1.j * a_i.imag
-        a_j = a_j.real.sigmoid() + 1.j * a_j.imag
-
         return a_j * a_i, a_j.real * kv_i + kv_j

     a = rearrange(a, '... -> ... 1')
+
+    # activations for state transitions
+    # sigmoid for magnitude, identity for phase
+
+    magnitude, phase = a.abs(), a.angle()
+    a = torch.polar(magnitude.sigmoid(), phase)
+
     _, kv = associative_scan(binary_operator, (a, kv), axis = 1)

     return einsum('b n d, b n d e -> b n e', q, kv)
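
The effect of this change can be seen on a single complex number. The sketch below is a scalar illustration only, not the repository's code: it uses the standard-library `cmath` module as a stand-in for the tensor calls `a.abs()`, `a.angle()` and `torch.polar` in the diff. The function names `old_activation` and `new_activation` and the sample value are hypothetical, chosen for illustration.

```python
import cmath
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def old_activation(a: complex) -> complex:
    # Mirrors the removed lines: sigmoid on the real part,
    # imaginary part passed through unchanged.
    return complex(sigmoid(a.real), a.imag)


def new_activation(a: complex) -> complex:
    # Mirrors the added lines: sigmoid on the magnitude,
    # identity on the phase (scalar stand-in for torch.polar).
    magnitude, phase = cmath.polar(a)
    return cmath.rect(sigmoid(magnitude), phase)


a = complex(0.5, 3.0)  # arbitrary illustrative value (assumption)

old = old_activation(a)
new = new_activation(a)

# The old form leaves the imaginary part unbounded, so the
# magnitude of the state transition can exceed 1.
print("old magnitude:", abs(old))

# The new form squashes the magnitude itself; since the input
# magnitude is non-negative, the result lies between 0.5 and 1.
print("new magnitude:", abs(new))

# The phase is unchanged by the new form.
print("phase before/after:", cmath.phase(a), cmath.phase(new))
```

Under the old form the magnitude of `a` is not bounded, whereas under the new form it is the magnitude that passes through the sigmoid and the phase that is preserved, which matches the comment "sigmoid for magnitude, identity for phase" in the diff.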

setup.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 setup(
   name = 'gateloop-transformer',
   packages = find_packages(exclude=[]),
-  version = '0.0.6',
+  version = '0.0.7',
   license='MIT',
   description = 'GateLoop Transformer',
   author = 'Phil Wang',
