beam search cannot be parsed correctly. #2385

Closed
lcy-seso opened this issue Jun 5, 2017 · 7 comments · Fixed by #2384

lcy-seso commented Jun 5, 2017

I0605 18:50:24.263185 18007 Util.cpp:166] commandline:  --use_gpu=False --trainer_count=1
F0605 18:50:46.687893 18007 NeuralNetwork.h:94] Check failed: it != layerMap_.end() Unknown layer __decoder_group_eos_layer__@decoder_group
*** Check failure stack trace: ***
    @     0x7f7019fed7bd  google::LogMessage::Fail()
    @     0x7f7019ff126c  google::LogMessage::SendToLog()
    @     0x7f7019fed2e3  google::LogMessage::Flush()
    @     0x7f7019ff277e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f7019c2afef  paddle::NeuralNetwork::getLayer()
    @     0x7f7019c3932c  paddle::RecurrentGradientMachine::resizeOrCreateFrames()
    @     0x7f7019c3873b  paddle::RecurrentGradientMachine::init()
    @     0x7f7019bd3be9  paddle::RecurrentLayerGroup::initSubNetwork()
    @     0x7f7019c285c5  paddle::NeuralNetwork::init()
    @     0x7f7019c52bde  paddle::GradientMachine::create()
    @     0x7f7019fc10cc  GradientMachine::createFromPaddleModelPtr()
    @     0x7f7019fc120d  GradientMachine::createByConfigProtoStr()
    @     0x7f7019ab07d5  _wrap_GradientMachine_createByConfigProtoStr__SWIG_0
    @     0x7f7019ab0e10  _wrap_GradientMachine_createByConfigProtoStr
    @     0x7f7021f513a3  PyEval_EvalFrameEx
    @     0x7f7021f53130  PyEval_EvalCodeEx
    @     0x7f7021f514a1  PyEval_EvalFrameEx
    @     0x7f7021f53130  PyEval_EvalCodeEx
    @     0x7f7021f514a1  PyEval_EvalFrameEx
    @     0x7f7021f53130  PyEval_EvalCodeEx
    @     0x7f7021edf27d  function_call
    @     0x7f7021eb70f3  PyObject_Call
    @     0x7f7021ec9f7f  instancemethod_call
    @     0x7f7021eb70f3  PyObject_Call
    @     0x7f7021f0d80e  slot_tp_init
    @     0x7f7021f0c478  type_call
    @     0x7f7021eb70f3  PyObject_Call
    @     0x7f7021f50887  PyEval_EvalFrameEx
    @     0x7f7021f53130  PyEval_EvalCodeEx
    @     0x7f7021f514a1  PyEval_EvalFrameEx
    @     0x7f7021f51c56  PyEval_EvalFrameEx
    @     0x7f7021f53130  PyEval_EvalCodeEx

Besides eos_layer, several other layers are not created in generation.

lcy-seso added the Bug label Jun 5, 2017
lcy-seso self-assigned this Jun 5, 2017

lcy-seso commented Jun 6, 2017

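The error above is caused by the following:

  1. When paddle.layer.A is called, the information about A is written into g_layer_map, a global dictionary of layers.

  2. The output of one layer becomes the input of the next; these output --> input links record the connectivity of the network.

  3. A "root" node is specified, and the layers that are actually used are selected by a breadth-first traversal starting from it: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/layer.py#L141

  4. In generation, eos_id is a special case. Generation specifies the max_id below as the "root" and then traverses each input of max_id. The eos layer, however, takes max_id as its input, so the traversal never selects it.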

                      /--> eos
    softmax --> max_id 
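
The selection problem in point 4 can be reproduced on a toy graph. The sketch below is not Paddle's code: the `inputs_of` dictionary and the `used_layers` helper are hypothetical stand-ins, and the layer names only mirror the diagram. It illustrates why a traversal that walks from the root toward its inputs never reaches a layer that consumes the root.

```python
from collections import deque

# Toy graph (hypothetical): each layer maps to the layers it takes as input.
inputs_of = {
    "softmax": [],
    "max_id": ["softmax"],
    "eos": ["max_id"],  # eos consumes max_id, it is not an input of max_id
}


def used_layers(root, inputs_of):
    """Collect every layer reachable from `root` by following inputs (BFS)."""
    seen = {root}
    queue = deque([root])
    while queue:
        layer = queue.popleft()
        for parent in inputs_of[layer]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


selected = used_layers("max_id", inputs_of)
print(sorted(selected))  # ['max_id', 'softmax']
```

Because `eos` only appears as a consumer of `max_id`, it is absent from the selected set, which matches the "Unknown layer" failure in the log.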
    


lcy-seso commented Jun 6, 2017

  1. The eos layer is similar to the extra_layers in the Topology class.
  2. Or, from another angle, every layer wrapped inside a layer_group should be considered "used" and therefore always be selected.


lcy-seso commented Jun 6, 2017

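If we do not want a special-case rule to handle the eos_layer in generation, what would be a better fix? @emailweixu

For example, something like this:

In __get_used_submodels__, if a submodel is a recurrent_layer_group, check whether every layer in that layer group has been added to layer_names, and add any that are missing.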


lcy-seso commented Jun 6, 2017

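# `cp` is assumed to be paddle.trainer.config_parser, as imported in v2/layer.py.
def __get_used_submodels__(layer_names):
    # layer_names must be a set; it is extended in place below.
    submodel_names = set()
    for submodel in cp.g_config.model_config.sub_models:
        if submodel.name in layer_names:
            submodel_names.add(submodel.name)
            # Treat every layer inside a recurrent layer group as used,
            # even if the traversal from the output did not reach it.
            if submodel.is_recurrent_layer_group:
                layer_names |= set(submodel.layer_names)
    return submodel_names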

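Modifying the __get_used_submodels__ function as in the code above, so that the layers enclosed by a recurrent_layer_group are always treated as used, fixes the undefined eos_layer problem during generation.

However, it introduces another bug:

  1. The generator defines a data_layer to store predict_word, and this data_layer is automatically recognized as an input layer of the network: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/layer.py#L141
  2. Generation then fails at this line: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/topology.py#L95

The error message is: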

Traceback (most recent call last):
  File "train.py", line 216, in <module>
    main()
  File "train.py", line 189, in main
    field=['prob', 'id'])
  File "/home/caoying/paddle_codes/paddle_book/08.machine_translation/paddle/v2/inference.py", line 133, in infer
    inferer = Inference(output_layer=output_layer, parameters=parameters)
  File "/home/caoying/paddle_codes/paddle_book/08.machine_translation/paddle/v2/inference.py", line 41, in __init__
    self.__data_types__ = topo.data_type()
  File "/home/caoying/paddle_codes/paddle_book/08.machine_translation/paddle/v2/topology.py", line 95, in data_type
    for nm in self.proto().input_layer_names]
KeyError: u'__beam_search_predict__'


lcy-seso commented Jun 6, 2017

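This special data_layer in generation can be handled fairly simply. It happens to be already designated as the output of the whole network, so a data_layer that is already designated as an output can be detected and not added as an input of the network.

Testing now.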
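
A minimal sketch of that filtering rule, under stated assumptions: `select_input_layers` and the layer name `source_words` are hypothetical, and only `__beam_search_predict__` comes from the traceback above. It is not the code in topology.py, just an illustration of skipping data layers that are already outputs.

```python
def select_input_layers(used_data_layers, output_layers):
    """Return the data layers that should be treated as network inputs.

    A data layer that is already declared as an output is skipped, so it
    never shows up among the input layer names. (Hypothetical helper.)
    """
    outputs = set(output_layers)
    return [name for name in used_data_layers if name not in outputs]


# "source_words" is an illustrative name; "__beam_search_predict__" is the
# layer named in the KeyError above.
data_layers = ["source_words", "__beam_search_predict__"]
outputs = ["__beam_search_predict__"]

inputs = select_input_layers(data_layers, outputs)
print(inputs)  # ['source_words']
```

With the generated-word layer filtered out, a lookup over the input layer names would no longer ask for `__beam_search_predict__`.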


emailweixu commented Jun 6, 2017

The eos problem can be solved by modifying def add_additional_parents() in v2/layer.py.


lcy-seso commented Jun 6, 2017

The eos_layer fix via add_additional_parents is in PR #2384.
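
The thread does not show the contents of PR #2384, so the following is only one way to read the "additional parents" idea, on a toy graph. The helper `used_layers`, the `extra_parents` argument, and the layer names are all hypothetical. The assumption is that registering eos as an extra parent of the generated-word layer lets the traversal reach it.

```python
def used_layers(root, inputs_of, extra_parents=None):
    """Collect layers reachable from `root` via inputs plus extra parent links."""
    # Copy the input lists so the caller's graph is not modified.
    parents = {name: list(srcs) for name, srcs in inputs_of.items()}
    for child, extra in (extra_parents or {}).items():
        parents.setdefault(child, []).extend(extra)

    seen, stack = set(), [root]
    while stack:
        layer = stack.pop()
        if layer in seen:
            continue  # also guards against the max_id <-> eos cycle
        seen.add(layer)
        stack.extend(parents.get(layer, []))
    return seen


# Toy graph (hypothetical): each layer maps to the layers it takes as input.
inputs_of = {"softmax": [], "max_id": ["softmax"], "eos": ["max_id"]}

without_extra = used_layers("max_id", inputs_of)
with_extra = used_layers("max_id", inputs_of, {"max_id": ["eos"]})

print(sorted(without_extra))  # ['max_id', 'softmax']
print(sorted(with_extra))     # ['eos', 'max_id', 'softmax']
```

Compared with always selecting every layer in a recurrent layer group, an extra parent link keeps the selection driven by the graph and adds only the one layer that generation needs.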
