Case-Based Reasoning VS Reinforcement Learning – Part 1 – Introduction

Introduction to the introduction

Greetings! Today I want to talk about the AI topics I’m most interested in these days: RL, CBR, hybrid CBR/RL techniques, and the comparison of CBR and RL techniques. As far as I know, the hybridization of CBR with RL is a relatively modern research topic (since 2005). However, I have never seen any document comparing RL to CBR (I’d love to see one).

I will give a brief introduction today.

Case based reasoning

Case-based reasoning (CBR) is a Machine Learning technique which aims to solve a new problem (case) using a database (case-base) of old problems (cases). In other words, it depends on past experience.
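To make the idea concrete, here is a minimal sketch of retrieving and reusing a case. It is only an illustration under my own assumptions: the `Case` class, the Euclidean similarity measure and the example problems are all hypothetical, and the adaptation and revision steps are left out for brevity.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass
class Case:
    problem: Tuple[float, ...]  # feature vector describing an old problem
    solution: str               # what was done to solve it


def retrieve(case_base: List[Case], problem: Tuple[float, ...]) -> Case:
    """Return the stored case whose problem is closest to the new one."""
    return min(case_base, key=lambda c: dist(c.problem, problem))


def solve(case_base: List[Case], problem: Tuple[float, ...]) -> str:
    """Retrieve the most similar case, reuse its solution, retain the result."""
    best = retrieve(case_base, problem)
    solution = best.solution                   # reused as-is (no adaptation here)
    case_base.append(Case(problem, solution))  # retain the new experience
    return solution


# Hypothetical case-base with two old problems.
case_base = [
    Case((0.0, 0.0), "defend"),
    Case((10.0, 10.0), "attack"),
]
print(solve(case_base, (9.0, 8.0)))  # prints "attack"
```

A real system would usually need a domain-specific similarity measure instead of plain Euclidean distance.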

CBR Cycle


Reinforcement learning

Reinforcement Learning (RL) is a subfield of Machine Learning. It’s considered a hybrid of supervised and unsupervised learning. It simulates human learning by trial and error.

RL Cycle


  • NB: For people unaware of the above two techniques, please read more about them before proceeding.

Can we compare CBR to RL?

I think we can. Some people say, “RL is a problem but CBR is a solution to a problem, so how can you compare them?” My answer is: when I say RL, I implicitly mean RL techniques such as TD-Learning and Q-Learning. CBR solves the learning problem by relying on past experience, while RL solves it by trial and error.
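For readers who have not seen Q-Learning, the following is a minimal tabular sketch of learning by trial and error. The five-cell corridor environment, the hyperparameter values and the function names are my own assumptions, chosen only to keep the example small.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # illustrative hyperparameters
ACTIONS = (-1, +1)                      # step left or step right
GOAL = 4                                # states are 0..4; reaching 4 pays 1


def step(state, action):
    """Hypothetical environment: a corridor with a reward at the far end."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)              # (state, action) -> estimated value
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:  # explore: try a random action
                action = rng.choice(ACTIONS)
            else:                       # exploit: best known action, ties broken randomly
                action = max(ACTIONS, key=lambda a: (q[(state, a)], rng.random()))
            next_state, reward, done = step(state, action)
            # Q-Learning update: move the estimate toward reward + discounted best next value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The learned policy should prefer stepping right in every state, since that is the only way to reach the reward.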

How are CBR and RL similar anyway?

  • In CBR we have the case-base; in RL we have the state-action space. Each case consists of a problem and its solution. In RL the action can likewise be considered the solution to the current state.
  • In CBR there can be a value for each case that measures the performance of that case; in RL each state-action pair has its value.
  • In RL, rewards from the environment are the way of updating the state-action pairs’ values. In CBR there are no rewards, but after a case is applied and tested, a revision process updates its performance value.
  • Retrieving a case in CBR corresponds in RL to “the policy of choosing actions from the action space”.
  • In CBR adaptation is performed after retrieving the case; in RL no adaptation is performed because the action-space contains ALL the possible solutions.

There are more similarities, but these are enough for now.

So are CBR and RL techniques the same thing?

Of course not, there are many differences between them. CBR solves the learning problem by depending on past experience while RL solves the learning problem depending on trial and error.

What is the significance of hybrid CBR/RL Techniques?

It’s as if we make the computer learn BOTH by trial and error and by past experience. Of course, this often leads to better results. We can say that the two complement each other.

How is CBR hybridized with RL and vice versa?

CBR needs RL techniques in the revising phase. Q-learning is used in the paper “Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL” (2007), and another example is the master’s thesis “A CBR/RL system for learning micromanagement in real-time strategy games” (2009).
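As a rough illustration of what an RL-style revision step could look like, here is a small sketch in which each case carries a performance value that is nudged toward the observed outcome. This is not the algorithm from either of the works above; the `ValuedCase` class, the learning rate and the example outcomes are hypothetical.

```python
from dataclasses import dataclass

ALPHA = 0.2  # illustrative learning rate


@dataclass
class ValuedCase:
    problem: str
    solution: str
    value: float = 0.0   # running estimate of how well this case performs


def revise(case: ValuedCase, reward: float, gamma: float = 0.0,
           next_best_value: float = 0.0) -> None:
    """Move the case's value toward the observed outcome (a TD-style update)."""
    target = reward + gamma * next_best_value
    case.value += ALPHA * (target - case.value)


case = ValuedCase(problem="enemy rush", solution="build towers")
for outcome in (1.0, 1.0, 0.0):   # hypothetical results of applying the case
    revise(case, outcome)
print(round(case.value, 3))  # prints 0.288
```

With `gamma` above zero and `next_best_value` taken from the best case retrieved for the following situation, the same update has the shape of the Q-learning rule.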

RL needs CBR for function approximation; an example is the paper “CBR for State Value Function Approximation in Reinforcement Learning” (2005).
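One simple way to picture this (my own sketch, not the method of the paper above) is to store visited states with their learned values as cases, then estimate the value of an unseen state from its nearest stored neighbours. The function name, the distance weighting and the example data are assumptions.

```python
from math import dist


def approximate_value(cases, state, k=2):
    """Estimate V(state) as a distance-weighted mean of the k nearest stored values."""
    nearest = sorted(cases, key=lambda c: dist(c[0], state))[:k]
    # Closer cases get larger weights; the small constant avoids division by zero.
    weights = [1.0 / (dist(s, state) + 1e-9) for s, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Hypothetical case-base: (state, learned value) pairs for a continuous 2-D state.
cases = [((0.0, 0.0), 0.0), ((1.0, 0.0), 1.0), ((5.0, 5.0), 10.0)]
print(approximate_value(cases, (0.5, 0.0)))  # halfway between the first two cases
```

This lets a finite case-base stand in for a value table over a continuous state space.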

RL needs CBR for learning continuous action models (also a form of approximation); an example is “Learning Continuous Action Models in a Real-Time Strategy Environment” (2008).

Heuristically Accelerated RL: one of the heuristics here is the case-base; an example is “Improving Reinforcement Learning by using Case Based Heuristics” (2009).
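As I understand the heuristically accelerated idea, the heuristic only biases which action is chosen, while the value updates stay the same. The sketch below assumes the common form "pick the action maximising Q(s, a) + xi * H(s, a)"; the weight `xi`, the dictionaries and the game situation are hypothetical, and the exact formulation in the paper may differ.

```python
def choose_action(q, heuristic, state, actions, xi=1.0):
    """Pick the action maximising Q(s, a) + xi * H(s, a)."""
    return max(
        actions,
        key=lambda a: q.get((state, a), 0.0) + xi * heuristic.get((state, a), 0.0),
    )


actions = ("attack", "defend")
q = {("base_under_fire", "attack"): 0.2, ("base_under_fire", "defend"): 0.1}
# Hypothetical heuristic built from a retrieved case that suggests "defend".
h = {("base_under_fire", "defend"): 0.5}

print(choose_action(q, h, "base_under_fire", actions))          # prints "defend"
print(choose_action(q, h, "base_under_fire", actions, xi=0.0))  # prints "attack"
```

Setting `xi` to zero switches the heuristic off and gives plain greedy selection on the Q-values.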

Continuous Case Based Reasoning – 1993

Experiments with reinforcement learning in problems with continuous state and action-spaces – 1997

A new heuristic approach for dual control – 1997

So the questions here are …

  • When to use Reinforcement Learning and when to use Case-Based Reasoning?
  • Are there cases where applying one of them is ruled out entirely from a technical point of view?
  • If we are dealing with an infinite number of state-action pairs, what is the preferred solution? Could both techniques serve well separately?
  • How can we make the best use of the hybrid CBR/RL approach to tackle the problem?

I’ll do my best to answer them in future post(s). However, I’d be delighted if an expert could answer now.


11 thoughts on “Case-Based Reasoning VS Reinforcement Learning – Part 1 – Introduction”

  1. well I don’t know if you may compare this, because RL does not require ‘ANY’ information besides the reward signal and the ‘number’ of actions.

    however, I see your point and keep up the good work.
    nice indeed!

    but I cannot see any serious scientific attempt here at a formal comparison…

    • sure this is not a precise scientific attempt YET, but there will be soon .

      Sorry i don’t understand what do u mean by “required information” ?? The comparison here is between RL Techniques -such as Q-learning- and Case-Based reasoning as a technique. Would u explain more please ?

  2. by “required information” he means things like this:
    – in CBR you need: a similarity measure, an adaptation procedure, a case representation language
    – in RL you need: a reward function, a state representation, an action representation, a transition function (which is typically implicit in the environment)

    All of the previous things can be seen as “knowledge containers”. For instance, just the way states are represented in RL gives a lot of information and introduces a bias over the set of policies that can be learnt. Imagine a situation where we want to make a system which learns to play an RTS game using RL. The set of states an RTS game can be in is huge, and thus typically, an abstraction is performed, and a reduced state space is used for RL. This captures a lot of domain knowledge (and thus, is considered as “required information”).

    But I agree with “me”, in that you need a more serious scientific approach to this. Your post is a good start, but you need to go deeper. Since your original question is quite theoretical (“what is the relation between CBR and RL”), start with understanding the theoretical foundations of the approaches. Not just a surface understanding of their algorithms. For instance, to understand the question that “me” asked you about “required information” you should read Michael Richter’s “knowledge containers” approach to see CBR systems.

    All in all, this is a very interesting and exciting topic. So, keep up with the good work, and hope to see your publications on this topic soon 🙂

    • Thank you a lot Santi for your very useful information 🙂

      Yes this is not – yet – a serious scientific approach, i was introducing the topic and collecting the feedback. I was really looking for an expert like you to comment and seems i was fortunate.
      I haven’t read about knowledge containers before but i will start to. I shall also understand the theoretical foundations more as u said to go deeper in the relation between CBR & RL.

      Thanks again for your guidance and encouragement:)

  3. Hi, nice article; here are my thoughts:

    – I’ve never heard of RL being a hybrid of Supervised and Unsupervised learning — how is it Supervised? How is it Unsupervised? Might want to double check this 🙂

    In “How are CBR and RL similar”:

    – I would think the CBR case base is more analogous to the policy in RL,
    not the state-action space.

    – I don’t understand the bit about no adaptation being performed in RL,
    again probably due to bad analogy with the action space.

    “So are CBR and RL the same thing?” — here you are just repeating
    your own definition from above. I think one could also make the
    argument that CBR learns by trial and error, while RL learns from past
    experience. Yes, those are deliberately switched.

    In “How is CBR hybridized with RL”, you probably mean “can use”
    rather than “needs”.


    • Hello Arya,
      Firstly, thanks for commenting.
      -RL is the third type of Machine Learning. In Supervised Learning, complete feedback is given from the environment to the classifier (the training data are samples with their corresponding 100% known classes). In Unsupervised Learning, no feedback or help is given from the environment (the training data are just samples, and the classifier groups them into clusters based on similar traits between them). On the other side, in RL there is PARTIAL feedback from the environment. When the RL agent receives a reward, this reward partially contributes to the value of the state. That’s why i consider RL a hybrid.

      -I don’t understand why u think the case base is more similar to the policy in RL. Could u explain more?

      -In RL, no adaptation is done “explicitly”, meaning there is no separate process of adaptation. However, adaptation is done implicitly through the feedback from the environment.

      -Yes, I agree with u. In reality both learn from both. I just wanted to emphasize the most obvious way each one learns.

      -No, actually i mean “needs”. For example, CBR is used for state value approximation in RL, so yes, RL needs CBR to overcome infinite value functions and make learning possible.

      Thank you again 🙂

      • Hi,

        Regarding RL as a Supervised + Unsupervised hybrid, I think I understand what you are saying now: Because supervised learning tells you the correct answer, and unsupervised tells you nothing about the answer, then a method like RL, which gives you some information that may lead you to the answer but does not give the answer explicitly, falls somewhere in between the two. But “hybrid” usually means a combination of two things — for example, I might say that Semi-supervised learning (where sometimes you receive the correct answer, and sometimes you receive no answer) was a hybrid of Supervised and Unsupervised learning. But I don’t think RL combines aspects of Supervised and Unsupervised learning, so I wouldn’t say RL was considered a hybrid of Supervised and Unsupervised learning any more than I would say Cats (four legs) are considered a hybrid of Humans (two legs) and Cockroaches (six legs.) Hope this helps 🙂

        Regarding the Case Base vs the Policy, I think I understand better what you meant now, but still it’s a little tricky. A policy is a mapping from distinct states to distinct actions, whereas the state-action space refers to every possible action in every possible state. So if you want to talk about a specific action in a specific state, or the best action currently known for a specific state, that comes from a policy or the best known policy.

        Regarding adaptation in RL, I agree with you: in CBR, a related case is retrieved from the Case Base, and then the previous solution is adapted to fit the current problem. In RL, the best action is retrieved, and that’s it, so yes, no explicit adaptation. So maybe instead of “in RL no adaptation is performed because the action-space contains ALL the possible solutions,” you mean, “in RL no adaptation is performed because the policy is defined for ALL possible states/‘problems’”?

        Regarding “needs”, as in “RL needs CBR for value approximation”, do you mean “requires”? Because there are several methods for function approximation other than CBR. Or do you mean “would benefit from”?


      • Aha, I understand now what you meant regarding the word “Hybrid”. Looks like it’s not the precise word here. Regarding Case Base VS Policy, hmm yeah, comparing the Case Base with the policy itself is more precise than comparing the case base with the state-action space. Regarding adaptation in RL, yes, i meant that. Regarding “needs”, yes, i agree with u that “would benefit from” or “requires” would be a better word to express what i mean.

        Thanks a lot for your comments, they were very useful. My Reinforcement Learning Brain learned stuff from that 😀

        If I write about this topic again, then allow me to let you know about it so I can hear your comments again 🙂
