To be honest, what you describe is something completely different from the handling of multi-value fields.
Here is a thread about how to remove duplicate words from a string:
BUT: this function compares the words literally; it does not recognize that one word might be similar to another. Still, have a look at it, perhaps it solves some of your cases.
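To show what "literally" means in practice, here is a minimal Python sketch. It is not the function from the thread; the name `remove_duplicate_words` and the choice to split on whitespace are my own assumptions for illustration.

```python
def remove_duplicate_words(text: str) -> str:
    """Drop repeated words, keeping the first occurrence and the original order.

    Hypothetical helper: words are split on whitespace and compared exactly,
    so case, plural forms and spelling variants count as different words.
    """
    seen = set()
    kept = []
    for word in text.split():
        if word not in seen:  # exact, case-sensitive comparison
            seen.add(word)
            kept.append(word)
    return " ".join(kept)


if __name__ == "__main__":
    print(remove_duplicate_words("red blue red green blue"))  # red blue green
    print(remove_duplicate_words("Apple apple apples"))       # Apple apple apples
```

The second call shows the limitation: "Apple", "apple" and "apples" all survive, because nothing here decides that they mean the same thing.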