Fuzzy answers

otacke's picture

Last week, I chatted with a lecturer who was quite impressed with H5P. He wondered if it was possible to use fuzzy answers within clozes, aka Fill in the Blanks. It was not, but it's basically no big deal, so I implemented it this weekend. You can either get the source code at github (make sure to grab https://github.com/otacke/h5p-blanks/tree/fuzzy_comparison and the new library https://github.com/otacke/h5p-text-utilities, as the latter might also be used for other things), or you can use the demo package that I attached to this post - but please beware of using this on a production system!

I have not created a pull request yet, because I'd like to get your opinions on how to implement it best for the user. There's probably nothing to do for the front end (besides optional hints about the exact answer), but I wonder what the best way would be to make it easy for the teaching user to select the proper options. This could range from a simple on/off switch to fine-tuning details of the algorithms that I used, or even combining them. Right now, I have aimed for something in between in the behavioral settings.

First, you can define a maximum number of operations that may be necessary to transform the given answer into the correct one (deleting, inserting, substituting, or swapping two adjacent characters). This option uses the Damerau-Levenshtein distance. For example, if you allow a maximum of 1 operation and the correct answer is "Einstein", a typo like "Einstien", where i and e are swapped, would also be accepted. However, this wouldn't work for accepting "Schroedinger" instead of "Schrödinger", because 2 operations would be required, e.g. replacing the "o" with "ö" and deleting the "e".
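To illustrate what this check does, here is a minimal sketch in JavaScript (the function name and details are mine for illustration, not necessarily how h5p-text-utilities implements it):

```javascript
/*
 * Minimal sketch (illustrative only): restricted Damerau-Levenshtein
 * distance, i.e. the number of single-character deletions, insertions,
 * substitutions or adjacent swaps needed to turn one string into the other.
 */
function damerauLevenshtein(a, b) {
  const d = [];
  for (let i = 0; i <= a.length; i++) {
    d[i] = [i]; // deleting i characters of a
  }
  for (let j = 0; j <= b.length; j++) {
    d[0][j] = j; // inserting j characters of b
  }
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = (a[i - 1] === b[j - 1]) ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,       // deletion
        d[i][j - 1] + 1,       // insertion
        d[i - 1][j - 1] + cost // substitution
      );
      // swap of two adjacent characters, e.g. "Einstien" vs. "Einstein"
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// damerauLevenshtein('Einstien', 'Einstein');        // 1 -> accepted with max. 1 operation
// damerauLevenshtein('Schroedinger', 'Schrödinger'); // 2 -> rejected with max. 1 operation
```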

Second, you can set a minimum similarity threshold (as a percentage) for an answer to still count as correct. This option uses the Jaro-Winkler similarity. It would accept the "Schrödinger" vs. "Schroedinger" example above, but may have trouble in other cases that the previous method handles better.
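Again a minimal, purely illustrative sketch of such a similarity measure, returning a value between 0 (completely different) and 1 (identical):

```javascript
// Illustrative sketch of the Jaro-Winkler similarity, not the library code.
function jaroWinkler(a, b) {
  if (a === b) {
    return 1;
  }
  const matchWindow = Math.max(Math.floor(Math.max(a.length, b.length) / 2) - 1, 0);
  const aMatched = new Array(a.length).fill(false);
  const bMatched = new Array(b.length).fill(false);

  // Count characters of a that also appear in b within the matching window.
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - matchWindow);
    const hi = Math.min(i + matchWindow + 1, b.length);
    for (let j = lo; j < hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) {
    return 0;
  }

  // Count transpositions among the matched characters.
  let transpositions = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) {
      continue;
    }
    while (!bMatched[k]) {
      k++;
    }
    if (a[i] !== b[k]) {
      transpositions++;
    }
    k++;
  }
  transpositions /= 2;

  const jaro = (matches / a.length + matches / b.length +
    (matches - transpositions) / matches) / 3;

  // Winkler modification: boost strings sharing a common prefix (up to 4 characters).
  let prefix = 0;
  while (prefix < Math.min(4, a.length, b.length) && a[prefix] === b[prefix]) {
    prefix++;
  }
  return jaro + prefix * 0.1 * (1 - jaro);
}

// jaroWinkler('Schroedinger', 'Schrödinger') comes out at about 0.9, so it would
// pass a threshold like 0.85 even though two edit operations are needed.
```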

For now, you can switch both options on and off independently. If both are used, the given answer counts as correct if at least one of the two methods accepts it.
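A sketch of how the two checks could be combined (the option names are made up for illustration; it reuses the two functions sketched above):

```javascript
// The answer counts as correct if at least one enabled method accepts it.
function isFuzzyMatch(answer, solution, options) {
  const closeEnoughByDistance = options.maxOperations !== undefined &&
    damerauLevenshtein(answer, solution) <= options.maxOperations;
  const closeEnoughBySimilarity = options.minSimilarity !== undefined &&
    jaroWinkler(answer, solution) >= options.minSimilarity;
  return closeEnoughByDistance || closeEnoughBySimilarity;
}

// isFuzzyMatch('Schroedinger', 'Schrödinger', { maxOperations: 1, minSimilarity: 0.85 });
// -> true (the similarity check accepts it even though the distance check does not)
```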

Your ideas are welcome!

 

tomaj's picture

Very nice stuff Otacke, I'm thoroughly impressed!

- Tom

fnoks's picture

Nice, otacke :) This is a really useful addition!

I haven't looked at the implementation, only tested it. Seems to do the job well :)

My only input at this point would be to make some changes to the semantics (labels and descriptions). E.g., I don't think the average author will understand what fuzziness means, and the same goes for Damerau-Levenshtein/Jaro-Winkler.

otacke's picture

Thanks. The labels were just temporary anyway. I am going to make them more understandable, create a pull request, and change anything else that you may find when having a closer look.

falcon's picture

Very cool, nice work!

Perhaps we should just have a checkbox with the label "Allow typos" and the description "Typos allowed for words longer than three characters". We could have an advanced settings group as well to adjust algorithm usage, but I'm not sure how useful that would be.

We could then use Damerau-Levenshtein with the number of operations depending on the length of the word, or Jaro-Winkler with ~0.75 as defaults?

I made something like this in another project, where I think I allowed one typo for words of 4-9 characters and two from 10 characters on. I only checked for misplaced characters though, so switched characters would probably count as two mistakes. It seemed to work well. Never got any complaints.
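In code, that heuristic might look roughly like this (purely illustrative; the numbers follow the description above):

```javascript
// Length-dependent rule: no typos for words of up to three characters,
// one typo for 4-9 characters, two from 10 characters on.
function allowedOperations(word) {
  if (word.length <= 3) {
    return 0;
  }
  if (word.length <= 9) {
    return 1;
  }
  return 2;
}

// allowedOperations('cat');         // 0
// allowedOperations('Einstein');    // 1
// allowedOperations('Schrödinger'); // 2
```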

otacke's picture

Well, I always like to have some more options ;-) But I also understand that this might not be a good idea in general. I like the idea of advanced settings, but I don't cling to it. Just give me a hint when you have decided at Joubel, and I'll gladly change the code.

Setting a fixed threshold for Jaro-Winkler is kind of tricky. It really depends on the use case, the domain and personal preferences, I guess.

Oh, and all it takes is flipping a true to a false to turn Damerau-Levenshtein into good old Levenshtein without swapping of characters :-) The same goes for Jaro and Jaro-Winkler. I already restricted the options for the users ;-)

thomasmars's picture

Yes, I appreciate the flexibility of the implementation.
It makes it really easy to fine-tune the functionality, whichever way is most appropriate.

papi Jo's picture

Hi Otacke,

I have just tested your "fuzzy answers" feature, which works as expected. However, I wonder what real-world use it can have. As a former language teacher, I actually see no usefulness in accepting faulty answers as correct. I'd be interested to hear of concrete, contextualized examples where fuzzy answers can be useful.

Thanks!

otacke's picture

As I mentioned briefly in my first post, I was asked for that feature by a lecturer who's a physicist. I guess he had something in mind when proposing those fuzzy answers, but I honestly didn't ask. Maybe I should indeed have done that, I see.

I came up with two use cases myself: typos and similar spellings. For example, if in a chemistry quiz I typed "deoxiribonucleic acid" instead of "deoxyribonucleic acid", I probably meant the right thing although I misspelled it. A teacher might decide that wrong spelling should not lead to a wrong answer in this case. That's not up to me to decide. The same goes for names such as "Schrödinger", "Schroedinger" or "Schrodinger". While you could also give alternative spellings explicitly, this might be tedious in some cases. For instance, transcribing Russian names might be tricky.

I don't feel passionate about the feature one way or the other.

thomasmars's picture

I can see this being very useful for complex answers where an exact answer is not necessary. I think it could work well as an optional feature if the author is aware of how and when to use it.
On the other hand, I think we should be cautious about adding too many complex features without having good ways of shielding new authors from them; too many features can be quite intimidating.

otacke's picture

When you get around to checking the pull request and decide that there's a better/easier way, just let me know. I'll gladly change the code.

fnoks's picture

You will hear from us :) Thanks for the pull request. We will get a native English speaker to look at the labels and descriptions.

otacke's picture

Sounds like a sensible idea ;-) And please tell me if I should change the scope of the options.

fnoks's picture

We will - stay tuned :)

otacke's picture

I may have an interesting option for the "fuzzy feature" and I'd like your feedback.

I am currently enrolled in a course about machine learning, and I finally got around to basically finishing the proposal for my capstone project yesterday. It has nothing to do with H5P, but part of the project might benefit from the string distance functions that I used. However, I woke up at about 5:30 this morning and wondered whether I might ditch my proposal and write a new one.

Using the fuzzy option in H5P requires balancing simple usage against flexibility, or at least effectiveness. I think if the results were really good, it would be totally fine if you could just switch "fuzziness" on and off without further options. The question is: which parameters for the distance functions, or which combinations of them, work best? There is neither an easy answer nor an absolute one, because it will depend on the kind of data (or words) that you're using. Still, there's room for experimentation, and some solutions might tend to be better than others. OK, let's get to the point.

I might propose to optimize a combination of string distance measures for usage in H5P clozes. It's basically building and training a set of classifiers and evaluating which one of them works best. There are still some things I'd have to figure out. For example, I'd have to find a suitable labeled dataset or build my own (that could be a nice little crowdsourcing project on its own, maybe gamified, where people get two similar words and have to tell whether they'd rate them as identical/similar enough or not), and I'd have to find a benchmark somewhere (some learning management systems might have a similar fuzzy option for comparison). Also, the proposal would have to be accepted first - but I'd have my current proposal as backup anyway.
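Just to make the idea a bit more tangible, here is a purely hypothetical sketch of the feature extraction step: each labelled word pair is turned into a vector of string measures that a classifier could be trained on (it reuses the functions sketched further up in this thread).

```javascript
// Hypothetical sketch only: build a feature vector from several string
// measures for one (answer, solution) pair. A classifier would then learn
// from labelled pairs whether such a vector means "close enough" or not.
function toFeatures(answer, solution) {
  const maxLength = Math.max(answer.length, solution.length);
  return [
    damerauLevenshtein(answer, solution),             // absolute edit distance
    damerauLevenshtein(answer, solution) / maxLength, // edit distance relative to word length
    jaroWinkler(answer, solution),                    // similarity between 0 and 1
    Math.abs(answer.length - solution.length)         // difference in length
  ];
}
```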

It might be a little overkill for such a tiny function in H5P, but why not? ;-) What do you think?

Update: I'll probably proceed as I had in mind - I have some cool ideas - but not for the capstone project. It would definitely take longer than my other proposal. Time is money in this case :-)

thomasmars's picture

It's definitely an interesting approach. I guess one of the big challenges would be finding relevant training data.

Good luck :)

papi Jo's picture

"I guess one of the big challenges would be finding relevant training data."

Yes, totally agree.

otacke's picture

I identified that obstacle as well - but I already have a plan involving crowdsourcing ... 

thomasmars's picture

Excellent, I'm looking forward to seeing your progress if you decide to pursue this. Don't hesitate to ask if there's something we can provide you with.

otacke's picture

Thanks! I guess some promotion would be cool as soon as I am ready to gather data.