Method for training strategy model and method and device for determining advertisement putting strategy
Main Author | |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 04.09.2020 |
Summary: | An embodiment of the invention provides a method for training a strategy model, and a method and device for determining an advertisement placement strategy. The method for training the strategy model comprises: acquiring sample information and network parameters related to advertisement placement; calculating a loss value of the evaluation network according to the state at the first moment, the state at the second moment, the advertisement placement strategy at the first moment, the reward value at the first moment and the first network parameter; updating the first network parameter using the loss value of the evaluation network to obtain a third network parameter; calculating the gradient of the second network parameter according to the state at the first moment, the advertisement placement strategy at the first moment, the second network parameter and the third network parameter; and updating the second network parameter according to that gradient. According to th |
---|---|
Bibliography: | Application Number: CN202010446815 |
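
The training steps listed in the summary resemble an actor-critic update, and can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the record does not give the network architectures, so the sketch assumes a linear policy (the "second network parameter", `theta2`) and an evaluation network that is linear in hand-picked features (the "first network parameter", `w1`). The function names, the discount factor, the learning rates, and the choice to use the policy's own output as the action at the second moment are all assumptions made for the example.

```python
# Minimal actor-critic style sketch of the steps in the summary.
# All names and constants below are hypothetical, chosen for illustration.

GAMMA = 0.9       # discount factor (assumed)
LR_CRITIC = 0.05  # evaluation-network learning rate (assumed)
LR_ACTOR = 0.05   # strategy-model learning rate (assumed)


def actor(theta, s):
    """Assumed linear policy: maps a state vector to a scalar ad strategy."""
    return sum(t * x for t, x in zip(theta, s))


def features(s, a):
    """Assumed critic features: state, action, action squared, bias."""
    return list(s) + [a, a * a, 1.0]


def critic(w, s, a):
    """Evaluation network: value estimate, linear in the features above."""
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))


def td_error(w, theta, s1, a1, r1, s2):
    """Bootstrapped target minus current estimate for one sample."""
    target = r1 + GAMMA * critic(w, s2, actor(theta, s2))
    return target - critic(w, s1, a1)


def critic_loss(w, theta, s1, a1, r1, s2):
    """Loss value of the evaluation network: squared TD error."""
    return td_error(w, theta, s1, a1, r1, s2) ** 2


def update_critic(w1, theta, s1, a1, r1, s2):
    """Semi-gradient step (target held fixed); returns the third parameter."""
    td = td_error(w1, theta, s1, a1, r1, s2)
    return [wi + LR_CRITIC * td * fi for wi, fi in zip(w1, features(s1, a1))]


def actor_gradient(w3, s1, a1):
    """Gradient of the critic's value w.r.t. the policy parameters.

    dQ/da is taken at the first-moment strategy a1 using the updated
    critic w3; for the assumed linear policy, da/dtheta equals s1.
    """
    n = len(s1)
    dq_da = w3[n] + 2.0 * w3[n + 1] * a1
    return [dq_da * x for x in s1]


def train_step(w1, theta2, sample):
    """One pass over the steps in the summary for a single sample."""
    s1, a1, r1, s2 = sample
    w3 = update_critic(w1, theta2, s1, a1, r1, s2)
    grad = actor_gradient(w3, s1, a1)
    new_theta2 = [t + LR_ACTOR * g for t, g in zip(theta2, grad)]
    return w3, new_theta2
```

A call such as `train_step([0.0] * 4, [0.5], ([1.0], 0.5, 1.0, [1.0]))` runs one update on a one-dimensional state. In a real system both networks would more likely be neural networks trained on batches of logged placement data, which the sketch does not attempt to show.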