DiffMerge hangs when opening ANSI files in Unicode mode

aedell · Post by **aedell** » Wed Jan 05, 2005 2:00 pm

I was not able to Show Differences on two versions of a text file that was created in Notepad and saved with encoding type "ANSI"... DiffMerge would launch, but its window would remain blank and my PC's CPU usage stayed pinned at 100% - I had to force quit DiffMerge.

A colleague tried a DiffMerge on the same file and his worked fine - DiffMerge launched and displayed the file differences.

I logged in to my PC as a different user, and tried DiffMerge again, and it worked.

However, I then changed the DiffMerge Character Encoding options to "Unicode" and tried to do a DiffMerge on the same ANSI encoded text files, and DiffMerge "hangs" - forcing me to force quit.

So, the problem seems to be that DiffMerge does not gracefully handle opening an ANSI encoded file when it expects a UNICODE one.

Which leaves us with the following problem: sometimes we need to use Unicode mode and sometimes ANSI, but we won't know which until DiffMerge hangs. At that point we have to force quit it, reopen it using an encoding we're sure of, change the Encoding option, and retry Show Differences.
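One possible way to avoid the guess-and-hang cycle is to look at the first bytes of a file before launching the diff. The sketch below is not part of DiffMerge or Vault; `guess_encoding_mode` is a hypothetical helper, and it assumes that files saved from Notepad as "Unicode" begin with a UTF-16LE byte order mark while files saved as "ANSI" have none.

```python
import codecs

# Hypothetical helper, not a DiffMerge or Vault API.
# Checks the leading bytes of a file for a byte order mark (BOM).
def guess_encoding_mode(data: bytes) -> str:
    # UTF-32 BOMs are tested first because the UTF-32LE BOM
    # (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
    if data.startswith(codecs.BOM_UTF32_LE):
        return "utf-32-le"
    if data.startswith(codecs.BOM_UTF32_BE):
        return "utf-32-be"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # Assumption: no BOM means an 8-bit "ANSI" code page file.
    # A BOM-less UTF-8 or UTF-16 file would be misreported here.
    return "ansi"

# In-memory samples standing in for the first bytes of real files.
ansi_sample = "SELECT 1".encode("cp1252")
unicode_sample = codecs.BOM_UTF16_LE + "SELECT 1".encode("utf-16-le")

print(guess_encoding_mode(ansi_sample))
print(guess_encoding_mode(unicode_sample))
```

With real files, the same check could be run on the first four bytes read in binary mode, and the result used to decide which Character Encoding option to set before selecting Show Differences.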

See the related posting about T-SQL scripts looking garbled in DiffMerge because they were saved using SQL Server's "Unicode" option.

Is there a workaround for this problem? Can this problem be addressed in another way? Which mode should I leave DiffMerge in to avoid the hang or crash?

(using DiffMerge 1.10 (2752) with Vault 3.0.0 Client)

dan · Post by **dan** » Thu Jan 06, 2005 8:57 am

I haven't been able to reproduce this with the test ANSI files I have been using. Can you forward us the two files you are using so we can reproduce it here?

Thanks,

aedell · Post by **aedell** » Thu Jan 06, 2005 12:37 pm

OK, I'll be sending you the files separately. I'd like to point out that the two files are actually two versions of the same file, as uploaded into Vault. So, in order to reproduce the bug (as I see it), you'll need to add the first file to Vault, check it in, check it out, replace its contents with the second file, check it in, Show History, then select the two versions and select Show Differences, with the encoding mode already set to Unicode beforehand.

Guest · Post by **Guest** » Thu Jan 06, 2005 1:17 pm

Just to clarify, lest any readers be confused: neither ANSI nor UNICODE is a character encoding.

ANSI is a standards body that makes lots of standards. Microsoft uses ANSI as a (weird) shorthand for whichever 8-bit character set and character encoding is default for the thread/user/system.

UNICODE is a character set, which has three or four major encodings. Microsoft often misuses Unicode to mean one of the two particular encodings on which they based Windows NT-4: UCS-2LE or UTF-16LE.
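As a rough illustration of the distinction, the sketch below encodes one short string several ways. It assumes code page 1252 as a stand-in for "ANSI", which is only one of the possible defaults on a Windows system; the sample text is arbitrary.

```python
# The same four characters, encoded four different ways.
text = "café"

# Assumption: cp1252 stands in for "ANSI"; the real default code page
# depends on the system and user locale.
encodings = ["cp1252", "utf-8", "utf-16-le", "utf-16-be"]
encoded = {name: text.encode(name) for name in encodings}

# Print each encoding's bytes as space-separated hex pairs.
for name, raw in encoded.items():
    print(f"{name:10} {len(raw)} bytes: {raw.hex(' ')}")
```

The byte sequences differ in both length and content, which is why a tool told to expect one of these encodings cannot simply be handed another.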