Learning in Multi-Linked Manipulator Control


    Dear Biomch-L readers,

    One of our subscribers asked me last evening to be removed from the list
    because of its apparently too medical content. In the wake of the recent
    postings on X-Ray System Collision Detection & Avoidance and on Artificial
    Neural / Logic Networks, I thought that the cross-posting below might set
    the balance straight.

    For those of you who do not have FTP, the BITFTP@PUCC.BITNET service can
    be useful. On the Internet, is a recent alternative
    (send HELP in the body of an email note).

    Regards -- hjw.
    Article 5759 in
    From: (C.K. Tham)
    Subject: Reinforcement Learning for Robot Control
    Date: 1 Sep 92 13:10:22 GMT
    Sender: (C.K. Tham)
    Organization: Cambridge University Engineering Department, UK


    The following technical report is available via anonymous ftp:

    by Chen K. Tham & Richard W. Prager.
    (Technical Report CUED/F-INFENG/TR104)


    We present a trajectory planning and obstacle avoidance method which uses
    Reinforcement Learning to learn the appropriate real-valued torques to
    apply at each joint of a simulated two-linked manipulator in order to move
    the end-effector to a desired destination in the workspace. The inputs to
    the controller are the joint positions and velocities which are fed
    directly into a Cerebellar Model Arithmetic Computer (CMAC) (Albus,75). In
    each state, the expected reward and appropriate torques for each joint are
    learnt through self-experimentation using a combination of the Temporal
    Difference (TD) technique (Sutton,87) and stochastic hillclimbing
    (Williams,88). Actions which cause the manipulator to reach the desired
    destination are rewarded whereas actions which lead to collisions with
    either joint limits or obstacles are punished by an amount proportional to
    the velocity before collision. After training, the manipulator is able to
    move along smooth collision-free paths from different start positions in
    the workspace to the destination.
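
    For readers unfamiliar with the ingredients named in the abstract, the
    following is a minimal sketch, not the authors' implementation, of a
    CMAC-style (tile-coding) value estimator trained with a one-step TD
    update. The class and function names, the number of tilings and bins,
    the learning rate and the discount factor are all assumptions made for
    illustration, as is the use of the joint-velocity norm in the collision
    penalty. The stochastic hillclimbing used to learn the torques is not
    covered here.

```python
import numpy as np


class CMAC:
    """Minimal CMAC (tile coding): several offset grids laid over the input space."""

    def __init__(self, lows, highs, n_tilings=8, n_bins=10, seed=0):
        self.lows = np.asarray(lows, dtype=float)
        self.highs = np.asarray(highs, dtype=float)
        self.n_tilings = n_tilings
        self.n_bins = n_bins
        dim = self.lows.size
        rng = np.random.default_rng(seed)
        # Each tiling is shifted by a random fraction of one bin width (assumed scheme).
        self.offsets = rng.random((n_tilings, dim))
        # n_bins + 1 cells per axis so that shifted inputs still land in the table.
        self.weights = np.zeros((n_tilings,) + (n_bins + 1,) * dim)

    def _cells(self, x):
        # Map the input to bin units, clip to the table, then shift per tiling.
        x = np.asarray(x, dtype=float)
        scaled = (x - self.lows) / (self.highs - self.lows) * self.n_bins
        scaled = np.clip(scaled, 0.0, self.n_bins - 1e-9)
        return np.floor(scaled[None, :] + self.offsets).astype(int)

    def predict(self, x):
        # The output is the sum of one active weight per tiling.
        idx = self._cells(x)
        return float(sum(self.weights[(t,) + tuple(idx[t])]
                         for t in range(self.n_tilings)))

    def update(self, x, error, lr):
        # Spread the correction equally over the active cells.
        idx = self._cells(x)
        for t in range(self.n_tilings):
            self.weights[(t,) + tuple(idx[t])] += lr * error / self.n_tilings


def td0_update(value, state, reward, next_state, terminal, gamma=0.95, lr=0.2):
    """One-step temporal-difference update of the expected-reward estimate."""
    target = reward if terminal else reward + gamma * value.predict(next_state)
    td_error = target - value.predict(state)
    value.update(state, td_error, lr)
    return td_error


def collision_penalty(joint_velocities, scale=1.0):
    """Negative reward proportional to speed just before a collision (assumed form)."""
    return -scale * float(np.linalg.norm(joint_velocities))
```

    In use, the state would be the two joint positions and two joint
    velocities, so `lows` and `highs` would each hold four values, and
    `td0_update` would be called once per simulation step.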

    The file is in compressed Postscript format (length 961935 bytes).

    Procedure for obtaining the report:

    unix> ftp
    Name: anonymous
    Password: (your e-mail address)
    ftp> cd reports
    ftp> binary
    ftp> get
    ftp> quit
    unix> uncompress
    unix> lpr .. etc. .. to print

    The authors welcome comments and suggestions from readers.

    Chen K. THAM, E-mail:
    Speech, Vision and Robotics Group, Tel. : +44 223 332754
    Cambridge University Engineering Department, Fax : +44 223 332662
    Trumpington Street,
    Cambridge CB2 1PZ,
    United Kingdom.