sql - How to get almost matching string from Oracle table? -
i have table in oracle 4 columns.
now user can enter input string "operation knee right" (which valid) query , query should return icd code (ikr123) matches of word in diagnosisname column.
following current query.(not giving proper output)
select diagnosisname (select diagnosisname, utl_match.jaro_winkler_similarity('%operation knee right%',diagnosisname) icd_code order utl_match.edit_distance_similarity('%operation knee right%',diagnosisname) desc ) rownum<2;
this query giving me output "left knee operation" expectation "right knee operation".
there few things note usage of utl_match:
- edit_distance_similarity : returns integer between 0 , 100, 0 indicates no similarity @ , 100 indicates perfect match.
- jaro_winkler_similarity : returns integer between 0 , 100, 0 indicates no similarity @ , 100 indicates perfect match tries take account possible data entry errors.
order utl_match.edit_distance_similarity('%operation knee right%',diagnosisname) desc
this not going give correct result. since, considering possible similarity, however, not considering data entry errors. so, must use jaro_winkler_similarity.
operation knee right
you need keep in mind case of input , column values compare. must in similar case correct match. passing input in lowercase, however, column values in initcap. better convert both column values , input similar case.
let's @ below demonstration understand:
sql> data as( 2 select 'heart operation' diagnosis_name, 'ih123' icd_code dual union 3 select 'knee operation' diagnosis_name, 'ik123' icd_code dual union 4 select 'left knee operation' diagnosis_name, 'ikl123' icd_code dual union 5 select 'right knee operation' diagnosis_name, 'ikr123' icd_code dual union 6 select 'fever' diagnosis_name, 'if123' icd_code dual 7 ) 8 select t.*, 9 utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation knee right')) eds, 10 utl_match.jaro_winkler_similarity (upper(diagnosis_name),upper('operation knee right')) jws 11 data t 12 order jws desc 13 / diagnosis_name icd_co eds jws -------------------- ------ ---------- ---------- right knee operation ikr123 20 72 knee operation ik123 20 70 heart operation ih123 25 68 left knee operation ikl123 25 64 fever if123 15 47 sql>
so, see how both different each other. jaro_winkler_similarity better job in identifying data entry errors , giving most close match. based on that, choose first row after sorting in descending order:
sql> data as( 2 select 'heart operation' diagnosis_name, 'ih123' icd_code dual union 3 select 'knee operation' diagnosis_name, 'ik123' icd_code dual union 4 select 'left knee operation' diagnosis_name, 'ikl123' icd_code dual union 5 select 'right knee operation' diagnosis_name, 'ikr123' icd_code dual union 6 select 'fever' diagnosis_name, 'if123' icd_code dual 7 ) 8 select diagnosis_name 9 10 (select t.*, 11 utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation knee right')) eds, 12 utl_match.jaro_winkler_similarity (upper(diagnosis_name),upper('operation knee right')) jws 13 data t 14 order jws desc 15 ) 16 rownum = 1 17 / diagnosis_name -------------------- right knee operation sql>
Comments
Post a Comment