Imagine that we have a robot and we want it to learn the most efficient way to exit from a house. We place the robot in a random room and it tries to find its way out. The robot learns from its successes and failures and starts improving; in a few tries the robot knows the most efficient way to exit from the house.
If the robot starts in room 1 or in room 3, there is nothing to choose: from room 1 he can only go to room 5, and from room 3 he can only go to room 4. If he is in room 2, he has two options: exit the house (he does not know it is the exit, but he will get rewarded) or go to room 4. A similar situation happens in room 5: he may exit the house (and get rewarded) or he may go to room 4 or to room 1. When the robot is in room 4 he has two ways to reach the exit: going through room 5 or going through room 2, and both take him out of the house in the same number of steps.
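For reference, this is the room map that InitStates() in the listing below encodes (state 6 simply represents being outside the house):

room 1 <-> room 5
room 3 <-> room 4
room 2 <-> room 4
room 4 <-> room 5
room 2 --> outside (state 6)
room 5 --> outside (state 6)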
When the robot exits the house, he gets a positive reinforcement (the nScore of the room he is leaving increases by 0.1). When he fails, he gets a negative reinforcement (that room's nScore decreases by 0.1). After a few tries, the robot learns the right way.
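For example, from room 3 the only move is to room 4, which is not the exit, so every visit to room 3 lowers its nScore by 0.1: after three visits it sits at -0.3. From room 2, choosing the exit raises room 2's nScore by 0.1, while choosing room 4 lowers it by 0.1.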
qlearning.prg
#include "FiveWin.ch"
static aStates   // the six states, shared by all the functions in this .prg
//----------------------------------------------------------------------------//
function Main()
local n
aStates = InitStates()
for n = 1 to 10
? "Number of steps to exit", SolveIt()
next
XBrowse( aStates )
return nil
//----------------------------------------------------------------------------//
function SolveIt()
local oState, nSteps := 0
// place the robot in a random room, excluding the exit state (the last element)
oState = aStates[ hb_RandomInt( 1, Len( aStates ) - 1 ) ]
// keep moving until the exit state is reached
while ! oState == ATail( aStates )
// show the options the robot is considering at this step
XBrowse( oState:aOptions, "Options for room: " + AllTrim( Str( oState:nId ) ) )
oState = oState:NextState()
nSteps++
end
return nSteps
//----------------------------------------------------------------------------//
function InitStates()
local aStates := Array( 6 )   // states 1 to 5 are rooms, state 6 is outside the house
AEval( aStates, { | o, n | aStates[ n ] := TState():New( n ) } )
aStates[ 1 ]:AddOption( aStates[ 5 ] )
aStates[ 2 ]:AddOption( aStates[ 4 ] )
aStates[ 2 ]:AddOption( aStates[ 6 ] )
aStates[ 3 ]:AddOption( aStates[ 4 ] )
aStates[ 4 ]:AddOption( aStates[ 2 ] )
aStates[ 4 ]:AddOption( aStates[ 3 ] )
aStates[ 4 ]:AddOption( aStates[ 5 ] )
aStates[ 5 ]:AddOption( aStates[ 1 ] )
aStates[ 5 ]:AddOption( aStates[ 4 ] )
aStates[ 5 ]:AddOption( aStates[ 6 ] )
aStates[ 6 ]:AddOption( aStates[ 2 ] )
aStates[ 6 ]:AddOption( aStates[ 5 ] )
return aStates
//----------------------------------------------------------------------------//
CLASS TState
DATA nId
DATA aOptions INIT {}
DATA nScore INIT 0
METHOD New( nId ) INLINE ::nId := nId, Self
METHOD AddOption( oState ) INLINE AAdd( ::aOptions, oState )
METHOD NextState()
ENDCLASS
//----------------------------------------------------------------------------//
METHOD NextState() CLASS TState
local n, nMax, oState, aOptions
if Len( ::aOptions ) == 1
oState = ::aOptions[ 1 ]
else
// shuffle the options so that ties between equal scores are broken at random
aOptions = AShuffle( AClone( ::aOptions ) )
// start from the first shuffled option so that one is always chosen,
// even if every score has gone negative
oState = aOptions[ 1 ]
nMax = oState:nScore
for n = 2 to Len( aOptions )
if aOptions[ n ]:nScore > nMax
nMax = aOptions[ n ]:nScore
oState = aOptions[ n ]
endif
next
endif
// reward this state if the chosen move leads out of the house (state 6),
// penalize it otherwise
if oState:nId == 6
::nScore += .1
else
::nScore -= .1
endif
return oState
//----------------------------------------------------------------------------//
function AShuffle( aArray )
// quick shuffle: ASort() with a random comparison codeblock
return ASort( aArray,,, { || HB_RandomInt( 1, 1000 ) < HB_RandomInt( 1, 1000 ) } )
//----------------------------------------------------------------------------//
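By the way, AShuffle() above sorts the array with a random comparison codeblock, which is a quick trick rather than a uniform shuffle. If you prefer, a classic Fisher-Yates shuffle only takes a few lines; this is just a sketch, and the name AShuffleFY is made up here:

function AShuffleFY( aArray )
   local n, nPos, xTemp
   // walk the array backwards, swapping each element with a randomly
   // chosen element from the portion that is not yet fixed
   for n = Len( aArray ) to 2 step -1
      nPos = hb_RandomInt( 1, n )
      xTemp = aArray[ n ]
      aArray[ n ] = aArray[ nPos ]
      aArray[ nPos ] = xTemp
   next
return aArray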
After a few tries the robot starts learning, and we can verify it by reviewing the different states' scores. When the robot is in room 5, he has three alternatives: room 1 (nScore -0.3), room 4 (nScore -0.3) and room 6 (nScore 0). Rooms 1 and 4 have lost 0.1 on each visit because neither of them leads directly out of the house, while room 6's score never changes since reaching it ends the run. So the robot takes the option with the highest value and learns that room 6, the exit, is the right choice.
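Once the scores have settled, you can also query the route the robot has learned. Here is a minimal sketch (ShowRoute() is a hypothetical helper, not part of the listing above; it would have to live in the same .prg so it can see the static aStates). It simply follows the option with the highest nScore at every step, without touching any scores, and gives up after ten moves in case the scores have not converged yet:

function ShowRoute( nRoom )
   local oState := aStates[ nRoom ]
   local cRoute := AllTrim( Str( nRoom ) )
   local n, oBest, nMoves := 0
   // follow the best scored option until the exit (state 6) is reached
   while oState:nId <> 6 .and. nMoves < 10
      oBest = oState:aOptions[ 1 ]
      for n = 2 to Len( oState:aOptions )
         if oState:aOptions[ n ]:nScore > oBest:nScore
            oBest = oState:aOptions[ n ]
         endif
      next
      oState = oBest
      cRoute += " -> " + AllTrim( Str( oState:nId ) )
      nMoves++
   end
return cRoute

After the training loop in Main() you could call, for instance, ? ShowRoute( 3 ) and expect something like "3 -> 4 -> 5 -> 6" or "3 -> 4 -> 2 -> 6"; both routes are optimal.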