Feel free to repurpose this as needed (create a function for it)
DECLARE @original NVARCHAR(MAX) = N'';
DECLARE @cleaned NVARCHAR(MAX) = N'';
WITH LowAscii AS
(
SELECT CASE WHEN ASCII(SUBSTRING(@original, 1, 1)) > 127
THEN ''
WHEN PATINDEX('%[^A-Z0-9 _-]%', SUBSTRING(@original, 1, 1)) > 0
THEN ''
ELSE SUBSTRING(@original, 1, 1)
END AS OneChar, 1 AS Idx
UNION ALL
SELECT CASE WHEN OneChar = '' AND SUBSTRING(@original, Idx+1, 1) = ' '
THEN ''
WHEN ASCII(SUBSTRING(@original, Idx+1, 1)) > 127
THEN ''
WHEN PATINDEX('%[^A-Z0-9 _-]%', SUBSTRING(@original, Idx+1, 1)) > 0
THEN ''
ELSE SUBSTRING(@original, Idx+1, 1)
END, Idx+1
FROM LowAscii
WHERE Idx < LEN(@original)
)
SELECT @cleaned = @cleaned + OneChar from LowAscii Option (MaxRecursion 32767);
Items of interest:
- The CTE (common table expression) works great in situations that require recursion. Here we're recursively calling ourself in order to parse individual characters in a string.
- PATINDEX is the way you perform RegEx-ish LIKE queries.