PDA

View Full Version : Postgres Regex to find Non-alphanumeric characters


seayakin
14th July 2011, 01:37 PM
Anyone out there know how to construct regex to find (ideally replace) non-alphanumeric characters in a column?

I've been looking here (http://www.postgresql.org/docs/8.3/static/functions-matching.html) and have been struggling. I never fully understand the syntax for regex so that is probably part of my problem and have only gotten others to work largely by trial and error.

Tried
SELECT * FROM my_schema.my_table WHERE barcode ~* '.*\k.*'

SELECT * FROM my_schema.my_table WHERE barcode ~* '.*\w.*'

BowlOfRed
14th July 2011, 01:59 PM
Not using postgres, so can't test. But I would suggest:

Character class -> []
Class containing alphanumerics -> [[:alnum:]]
Class containing everything but that -> [^[:alnum:]]

Match anything that has a non alpha-numeric
... ~* '[^[:alnum:]]'

Or maybe replace [:alnum:] with A-Za-z0-9 if that's what you mean and you don't want to take "special" alphanumeric characters.

not daSkeptic
14th July 2011, 02:59 PM
The following will find columns with non-alphanumeric characters:

select * from t where values ~* '[^a-z0-9]';

The following will return only the non-alphanumeric characters:

select substring(values from '[^a-z0-9]') from t where values ~* '[^a-z0-9]';

seayakin
14th July 2011, 05:28 PM
The following will find columns with non-alphanumeric characters:

select * from t where values ~* '[^a-z0-9]';

The following will return only the non-alphanumeric characters:

select substring(values from '[^a-z0-9]') from t where values ~* '[^a-z0-9]';

Thanks that is an obviously simple solution I overlooked.