What does "reinforcement learning algorithm" mean in Chinese?

強化式學習算法
強化學習算法

reinforcement: n. 1. 增強，加固；補強物，強化物；補給品。 2. 增援 ...
learning: n. 學，學習；學問，學識；專門知識。 good at ...
algorithm: n. 【數學】算法；規則系統；演段。
linear reinforcement learning algorithm: 線性再勵學習算法
learning algorithm: 演算法; 學習算法

Examples and usage

In this paper, by introducing joint actions into traditional reinforcement learning, a new multi-agent reinforcement learning algorithm based on behavior prediction is presented, and several methods for predicting other agents' behaviors are discussed.
在傳統強化學習方式中引入組合動作的基礎上，本文提出了一種基于行為預測的多智能體強化學習方法，研究了對其他智能體行為進行預測的幾種可行方法。
The reinforcement learning algorithm was also introduced, since it is related to the ant colony algorithm and can be used in scheduling problems. 4. Some new concepts and scheduling algorithms for batch chemical processes were proposed in our studies.
由于蟻群算法與人工智能中的強化學習算法之間有著某種聯系，同時強化學習近年來也應用于求解調度問題，因此本文也涉及到了一些強化學習的主要算法。
Reinforcement learning algorithms that use the cerebellar model articulation controller (CMAC) are studied to estimate the optimal value function of Markov decision processes (MDPs) with continuous states and discrete actions. The state discretization for MDPs using Sarsa-learning algorithms based on CMAC networks and direct gradient rules is analyzed. Two new coding methods for CMAC neural networks are proposed so that the learning efficiency of CMAC-based direct gradient learning algorithms can be improved.
在求解離散行為空間Markov決策過程(MDP)最優策略的增強學習算法研究方面，研究了小腦模型關節控制器(CMAC)在MDP行為值函數逼近中的應用，分析了基于CMAC的直接梯度算法對MDP狀態空間離散化的特點，研究了兩種改進的CMAC編碼結構，即：非鄰接重疊編碼和變尺度編碼，以提高直接梯度學習算法的收斂速度和泛化性能。
By means of the proposed reinforcement learning algorithm and a modified genetic algorithm, a neural network controller whose weights are optimized can generate time-series small perturbation signals to convert the chaotic oscillations of chaotic systems into desired regular ones. Computer simulations on controlling the Hénon map and the logistic chaotic system have demonstrated the capacity of the presented strategy by stabilizing lower periodic orbits such as period-1 and period-2. Meanwhile, when the periodic control methodology is utilized, higher periods such as period-4 can also be successfully directed to the expected periodic orbits.
該控制方法無需了解系統的動態特性和精確的數學模型,也不需監督學習所要求的訓練數據,通過增強學習訓練方式,采用改進遺傳算法優化神經網絡權系數,使之成為混沌控制器,便可產生控制混沌系統的時間序列小擾動信號,仿真實驗結果表明它不僅能有效鎮定混沌周期1、2等低周期軌道,而且在周期控制技術基礎上,也可成功將高周期混沌軌道(如周期4軌道)變成期望周期行為。
Based on the organization rules of Internet data, the distribution laws of hyperlinks, and the naming rules of URLs, an algorithm for TVM rebuilding is established, and satisfactory experimental results are obtained by applying this algorithm. Furthermore, efforts are made to apply TVM to browsing navigation, web page classification, and reinforcement learning algorithms.
結合互聯網資源的構建規則、鏈接分布規律和URL命名規則，論文提出了樹藤共生數據模型的重建算法，實驗結果驗證了樹藤共生模型的有效性與合理性，在此基礎上初步討論了樹藤共生模型在瀏覽導航、網頁分類和reinforcement learning算法中的應用。
