Method for training a strategy model, and method and device for determining an advertisement placement strategy

Bibliographic Details
Main Author ZHOU PENGCHENG
Format Patent
Language Chinese, English
Published 04.09.2020
Summary: An embodiment of the invention provides a method for training a strategy model, and a method and device for determining an advertisement placement strategy. The method for training the strategy model comprises: acquiring sample information and network parameters related to advertisement placement; calculating a loss value of the evaluation network according to the state at the first moment, the state at the second moment, the advertisement placement strategy at the first moment, the reward value at the first moment and the first network parameter; updating the first network parameter using the loss value of the evaluation network to obtain a third network parameter; calculating the gradient of the second network parameter according to the state at the first moment, the advertisement placement strategy at the first moment, the second network parameter and the third network parameter; and updating the second network parameter according to the gradient of the second network parameter. According to th
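The training steps listed in the summary resemble an actor-critic update, and one possible reading can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the record does not give the network architectures, the loss formula, or the learning rates. The sketch assumes a finite set of candidate placement strategies, a linear evaluation network (first network parameter `w`), a linear softmax strategy network (second network parameter `theta`), a one-step temporal-difference loss, and a discount factor `GAMMA`. All function and variable names are hypothetical.

```python
import numpy as np

GAMMA = 0.9  # assumed discount factor; not stated in the record


def q_values(w, s):
    """Assumed linear evaluation network: one weight row per candidate strategy."""
    return w @ s


def critic_loss_and_grad(w, s, s_next, a, r):
    """Loss from state s, next state s_next, strategy a, reward r and parameter w.

    Assumption: a one-step TD loss whose target is treated as a constant.
    """
    target = r + GAMMA * np.max(q_values(w, s_next))
    td_error = target - q_values(w, s)[a]
    loss = 0.5 * td_error ** 2
    grad = np.zeros_like(w)
    grad[a] = -td_error * s  # only the row of the chosen strategy is affected
    return loss, grad


def policy(theta, s):
    """Assumed softmax strategy network over the candidate strategies."""
    logits = theta @ s
    logits = logits - logits.max()  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()


def actor_grad(theta, w_new, s, a):
    """Gradient of log pi(a|s; theta) weighted by the updated critic's value."""
    p = policy(theta, s)
    one_hot = np.zeros_like(p)
    one_hot[a] = 1.0
    return q_values(w_new, s)[a] * np.outer(one_hot - p, s)


def train_step(w, theta, s, s_next, a, r, lr_critic=0.1, lr_actor=0.1):
    """One update: critic first (yielding the 'third' parameter), then actor."""
    loss, g_w = critic_loss_and_grad(w, s, s_next, a, r)
    w_new = w - lr_critic * g_w              # third network parameter
    g_theta = actor_grad(theta, w_new, s, a)
    theta_new = theta + lr_actor * g_theta   # gradient ascent on expected value
    return w_new, theta_new, loss
```

In this sketch the strategy network's gradient depends on the already-updated evaluation parameters, which mirrors the order of steps in the summary (the gradient of the second network parameter is computed from the third network parameter rather than the first).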
Bibliography: Application Number: CN202010446815