
Bad word filter is creating problems

Posted: Mon Feb 05, 2007 7:55 am
by abijah
Script URL: currently only on localhost; domain not yet bought
Version of script: latest
Hosting company: localserver

Dear Sir,

I am currently testing the script and found everything fantastic, but the word filter is causing a problem. I need it because I don't want bad words to be published, but it even filters words like "Causes".

How can I prevent this from happening? I have never edited the bad words list file "en.php"; it is the same as you shipped it.

Please help me.

Posted: Mon Feb 05, 2007 10:02 am
by Klemen
Doesn't block "Causes" here so I don't know why it would on your server?
http://www.phpjunkyard.com/mboard/msg/7854.html

Are you sure you have the 1.3 version?

how come?

Posted: Mon Feb 05, 2007 10:54 am
by abijah
don't know what to do now

Posted: Mon Feb 05, 2007 11:33 am
by Klemen
Well when you have it up and running post a link to your MBoard (as well as phpinfo file) and I will test it from there.

Solved

Posted: Mon Feb 05, 2007 4:28 pm
by abijah
The mistake was in the word filter. preg_replace matches the filter word a.s.s against "causes" and finds that it fits inside.

After filtering, "causes" is shown as "ca**" because part of it fits a.s.s: the dot acts as a wildcard in preg_replace. Long words like a.s.s.h.o.l.e will not cause problems because they will hardly fit inside any other word, but the short word "a.s.s" fits inside lots of words.

I also found that I have MBoard's previous version, not the latest. I downloaded it just a few days before you released the new version. But this has not changed in the new MBoard either. I don't know why your server does not behave like all my servers. Maybe we have different versions of PHP, so preg_replace works differently.

You had better remove the short dotted words like a.s.s from the filter too; the bigger ones will not cause problems.

This is what I found.
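
A minimal sketch of the wildcard effect described above. The one-entry list and the function names filter_raw and filter_quoted are made up for illustration; the real en.php is not shown in this thread. The second function is only one possible remedy (escaping the key with preg_quote), not necessarily what MBoard does.

```php
<?php
// Hypothetical single-entry list for illustration only.
$badwords = array('a.s.s' => 'a**');

// Pattern built from the raw key: each "." matches any single character.
function filter_raw($text, $badwords) {
    foreach ($badwords as $k => $v) {
        $text = preg_replace("/$k/i", $v, $text);
    }
    return $text;
}

// Same loop, but preg_quote() escapes the dots so they only match a literal ".".
function filter_quoted($text, $badwords) {
    foreach ($badwords as $k => $v) {
        $text = preg_replace('/' . preg_quote($k, '/') . '/i', $v, $text);
    }
    return $text;
}

echo filter_raw('causes', $badwords), "\n";    // ca**   ("auses" fits a.s.s)
echo filter_quoted('causes', $badwords), "\n"; // causes
echo filter_quoted('a.s.s', $badwords), "\n";  // a**
```

With the raw key, "auses" matches because u and e stand in for the two dots; with the escaped key only a literal "a.s.s" is rewritten.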

Posted: Mon Feb 05, 2007 5:51 pm
by abijah
First of all, I am sorry for the bad words.

I tested the words "causes, asses, ass , ass, ass hole, asshole, a s s h o l e, embarrasses, a.s.s.h.o.l.e"

With this code, using preg_replace: "$text = preg_replace("/$k/i",$v,$text);"

the output was "ca**, a**, a** , ass, a** hole, a******, a** h o l e, embarra**, a**.h.o.l.e"

With this code, using str_replace: "$text = str_replace("$k",$v,$text);"

the output was "causes, asses, a** , ass, a** hole, a** h o l e, embarrasses, a**.h.o.l.e"

So I think it is better to use str_replace. Are there any undesired side effects of this change that you know of?

Posted: Mon Feb 05, 2007 7:43 pm
by Klemen
Hi,

Well you obviously aren't using the latest version :roll:

The str_replace isn't a good solution because it is CaSe SenSitiVe, so "ass" would be blocked while "ASS" wouldn't. The case-insensitive str_ireplace is PHP 5 only.
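
A small sketch of the case-sensitivity difference, assuming a single key/value pair ($k, $v) like the ones the filter loop uses; the sample text is made up.

```php
<?php
$k = 'ass';
$v = 'a**';
$text = 'ass ASS';

// str_replace is case sensitive: the upper-case word slips through.
$plain = str_replace($k, $v, $text);
// str_ireplace ignores case (available from PHP 5).
$icase = str_ireplace($k, $v, $text);
// preg_replace with the /i modifier also ignores case.
$regex = preg_replace("/$k/i", $v, $text);

echo $plain, "\n"; // a** ASS
echo $icase, "\n"; // a** a**
echo $regex, "\n"; // a** a**
```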

Try changing it to the code version 1.3 has:

Code:

$text = preg_replace("/\b$k\b/i",$v,$text);
(note the \b before and after $k)
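
As a rough illustration of what the \b change does, the sketch below runs both patterns on two words from the test earlier in this thread. The function names old_filter and new_filter are hypothetical, and the key a.s.s is the one discussed above.

```php
<?php
$k = 'a.s.s';
$v = 'a**';

// Pattern without word boundaries: matches anywhere inside a word.
function old_filter($text, $k, $v) {
    return preg_replace("/$k/i", $v, $text);
}

// Pattern with \b on both sides: the match must start and end at a word boundary.
function new_filter($text, $k, $v) {
    return preg_replace("/\b$k\b/i", $v, $text);
}

echo old_filter('causes', $k, $v), "\n";      // ca**
echo old_filter('embarrasses', $k, $v), "\n"; // embarra**
echo new_filter('causes', $k, $v), "\n";      // causes
echo new_filter('embarrasses', $k, $v), "\n"; // embarrasses
// The dots are still wildcards, so a whole word of the right shape still matches.
echo new_filter('asses', $k, $v), "\n";       // a**
```

The boundaries stop matches that begin or end in the middle of a word, but a standalone word such as "asses" still fits the dotted key.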

Thanks

Posted: Mon Feb 05, 2007 7:51 pm
by abijah
Thanks man!

I never thought about the upper and lower case difference. I would be easily fooled that way :lol: :lol: :lol:

suggestion

Posted: Mon Feb 05, 2007 8:00 pm
by abijah
But one suggestion if I am not wrong.

You have these entries in the bad word list:

"a s s" => "a**"
"xxx" => "xxx"
"blah" => "blah"
"blah" => "blah"
"a s s h o l e" => "a*****"

Here the last one has no effect, because by the time the script reaches the last entry, the "a s s" part will already have become a**, so the final result will look like "a** h o l e".

So you had better put the bigger one, i.e. "a s s h o l e", above "a s s" in the array list.
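
The ordering point can be sketched as follows, assuming the list is applied entry by entry in array order with the \b pattern quoted earlier in the thread. Both lists, the replacement strings and the function name apply_filter are made up for illustration.

```php
<?php
// Apply every entry in array order, one preg_replace per entry.
function apply_filter($text, $badwords) {
    foreach ($badwords as $k => $v) {
        $text = preg_replace("/\b$k\b/i", $v, $text);
    }
    return $text;
}

// Short entry first: it rewrites the start of the longer phrase,
// so the longer entry no longer finds anything to match.
$short_first = array(
    'a s s'         => 'a**',
    'a s s h o l e' => 'a******',
);

// Long entry first: the whole phrase is replaced before the short entry runs.
$long_first = array(
    'a s s h o l e' => 'a******',
    'a s s'         => 'a**',
);

echo apply_filter('a s s h o l e', $short_first), "\n"; // a** h o l e
echo apply_filter('a s s h o l e', $long_first), "\n";  // a******
```

In the spaced form each letter is its own word, so the boundary after the third letter is satisfied and the short entry matches first.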

I hope you are not getting bored of my posts. I may find other errors, or I may want to suggest a better alternative if I know of one. Should I post them? I may be wrong sometimes :lol: though I am not too bad at PHP either.

Posted: Mon Feb 05, 2007 11:03 pm
by Klemen
Hi,

Not really, if you check the regular expression code

Code:

\b$k\b
you will see the \b sign which stands for word boundaries.

With the above code
"ass" => "a**"
would block "ass" but not "asshole" (nor Kansass, casses and similar). That's why "asshole" is required just as "ass" is.

You can test it by removing "asshole"=> from the bad words list and you will see it isn't being rewritten as a**hole. It was with the old code, but not with the "\b" one.
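
The test suggested above can be tried roughly like this. The two lists and the function name run_filter are hypothetical stand-ins for the real bad words file.

```php
<?php
// Filter loop using the \b pattern from version 1.3 as quoted in this thread.
function run_filter($text, $badwords) {
    foreach ($badwords as $k => $v) {
        $text = preg_replace("/\b$k\b/i", $v, $text);
    }
    return $text;
}

// List with the longer word removed.
$without_long = array('ass' => 'a**');
// List with both entries.
$with_long = array('ass' => 'a**', 'asshole' => 'a******');

echo run_filter('ass', $without_long), "\n";     // a**
echo run_filter('asshole', $without_long), "\n"; // asshole (no boundary after "ass")
echo run_filter('asshole', $with_long), "\n";    // a******
// For comparison, the old pattern without \b rewrites the prefix.
echo preg_replace('/ass/i', 'a**', 'asshole'), "\n"; // a**hole
```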

Oh and don't worry, I am no super expert in PHP either so any better alternatives or suggestions for improvements are always welcome :D