Knowledge Extraction with seq2seq

Updated: August 22, 2023

An attempt at knowledge extraction using a Reformer-based seq2seq model.
The code below is the encoder-decoder demo provided in reformer-pytorch:
https://github.com/lucidrains/reformer-pytorch


import torch
from reformer_pytorch import ReformerEncDec

DE_SEQ_LEN = 4096
EN_SEQ_LEN = 4096

enc_dec = ReformerEncDec(
    dim = 512,
    enc_num_tokens = 20000,
    enc_depth = 6,
    enc_max_seq_len = DE_SEQ_LEN,
    dec_num_tokens = 20000,
    dec_depth = 6,
    dec_max_seq_len = EN_SEQ_LEN
).cuda()

train_seq_in = torch.randint(0, 20000, (1, DE_SEQ_LEN)).long().cuda()
train_seq_out = torch.randint(0, 20000, (1, EN_SEQ_LEN)).long().cuda()
input_mask = torch.ones(1, DE_SEQ_LEN).bool().cuda()

loss = enc_dec(train_seq_in, train_seq_out, return_loss = True, enc_input_mask = input_mask)
loss.backward()
# learn
# evaluate with the following
eval_seq_in = torch.randint(0, 20000, (1, DE_SEQ_LEN)).long().cuda()
eval_seq_out_start = torch.tensor([[0.]]).long().cuda() # assume 0 is id of start token
samples = enc_dec.generate(eval_seq_in, eval_seq_out_start, seq_len = EN_SEQ_LEN, eos_token = 1) # assume 1 is id of stop token
print(samples.shape) # (1, <= EN_SEQ_LEN); decode the tokens afterwards
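
The demo trains on random token ids, so to try it on knowledge extraction the random tensors would have to be replaced with tokenized text. The following is a minimal sketch of that preprocessing step in plain Python, not something taken from reformer-pytorch. The character-level vocabulary, the padding id, the helper names, and the "subject | relation | object" target format are all assumptions made for illustration. The start and stop ids reuse the 0 and 1 assumed in the demo's comments.

```python
# Sketch only: the vocabulary, PAD_ID, and the "subject | relation | object"
# target format are illustrative assumptions, not part of reformer-pytorch.
BOS_ID, EOS_ID, PAD_ID = 0, 1, 2  # 0 = start, 1 = stop (as assumed in the demo); 2 = padding (assumed)
NUM_SPECIAL = 3


def build_vocab(texts):
    """Map every distinct character to an id placed after the special tokens."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + NUM_SPECIAL for i, ch in enumerate(chars)}


def encode_source(text, vocab, max_len):
    """Return (ids, mask): ids padded to max_len, mask True on real tokens."""
    ids = [vocab[ch] for ch in text][:max_len]
    pad = max_len - len(ids)
    mask = [True] * len(ids) + [False] * pad
    return ids + [PAD_ID] * pad, mask


def encode_target(text, vocab, max_len):
    """Wrap the target in start/stop tokens, then pad to max_len."""
    ids = [BOS_ID] + [vocab[ch] for ch in text][: max_len - 2] + [EOS_ID]
    return ids + [PAD_ID] * (max_len - len(ids))


sentence = "Paris is the capital of France."
triple = "Paris | capital_of | France"  # hypothetical serialization of one fact

vocab = build_vocab([sentence, triple])
src_ids, src_mask = encode_source(sentence, vocab, max_len=64)
tgt_ids = encode_target(triple, vocab, max_len=64)
print(len(src_ids), sum(src_mask), tgt_ids[0])
```

The resulting lists could be wrapped as `torch.tensor([src_ids])`, `torch.tensor([tgt_ids])`, and `torch.tensor([src_mask])` and passed in place of `train_seq_in`, `train_seq_out`, and `input_mask`, provided the vocabulary size stays below `enc_num_tokens` and `dec_num_tokens`. How padded target positions should be treated in the loss is not addressed here.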
