WCopyfind is an open source Windows-based program that compares documents and reports similarities in their words and phrases. It is free and available to anyone. It is licensed under the Gnu Public License, which basically means that you can do whatever you like with it except to try to sell it to someone else.
Download WCopyfind 4.1.5 Executable
Download WCopyfind.4.1.5 64-Bit Executable
Unlike most modern software packages, WCopyfind is a single executable file. You don’t install it, you just run it. Simply click on the link to download the executable file. If you’re running a 64-bit version of Windows, you can select the 64-bit executable, which runs about 10-20% faster than the 32-bit version. Place that file in a convenient location and double-click on it to execute it. It stores its settings in the Windows registry, but otherwise it doesn’t have any lasting effect on your computer.
View WCopyfind Instructions
WCopyfind is pretty simple to use, but some of the settings need explanations.
Download WCopyfind 4.1.5 Source
Because WCopyfind is open source software, you’re welcome to tinker with it to add features or make it behave differently.
How WCopyfind and Copyfind Work
I have been asked many times to explain how these comparison programs work and how they manage to complete their work so quickly. At long last, I have written an explanation of those programs. For more information, you’ll simply have to read the source code, which is now much clearer than it used to be.
- WCopyfind 2.7
- WCopyfind 3.0
- WCopyfind 3.0.1
63 thoughts to “WCopyfind”
I downloaded this but got a strange Chinese warning when I ran it and did not get any comparison stats even though the application keeps saying it’s comparing documents. Tried several different docs and formats. This doesn’t seem to work.
The “Chinese warning” that you saw is probably an internal error message that’s not displaying properly. Did you create a reporting folder? If not, you will definitely get an error. The error message should be readable, so I must have a bug to fix. I’ll take a look…
I found and fixed the bug that produced the “Chinese warning” message. Erik had indeed neglected to create a report folder, but the error message that popped up was garbled. I fixed the error message and uploaded WCopyfind 3.0.1 to this web site.
Hi, I downloaded 3.0.1 and it worked for some doc and docx documents. Thank you very much. However, it didn’t work for some other doc and docx documents. Some comparison results also came up with ASCII words. Thanks.
Could you please tell me what the symptoms were for the .docx files that didn’t work? Did they not read in at all or did they read in but not match what they should match?
I am aware that WCopyfind won’t read some .doc files properly. The program doesn’t actually understand the .doc formats, which are diverse and too complicated for me to decode without spending more time than I can afford. If that causes problems for you, the easiest solution is to convert those .doc files to .txt or .docx, using the latest versions of Word.
The symptoms were that the software produced a report without any results, or with only five or six comparisons, when I compared 20 documents. This still occurred even though I converted those documents (both doc and docx files) to .txt files. I can send samples of those documents to your email if that would help (would you mind sending me your email address?).
I get the following message when trying to open the exe:
The procedure entry point DecodePointer could not be located in the dynamic link library KERNEL32.dll.
I’m not sure what’s causing this error, but I’ll try to find the problem.
Hello. I’m a teacher interested in using WCopyFind to detect plagiarism. I opened WCopyFind and got the popup box with “Old Document Files” and “New Document Files”…is this program just for comparing 2 documents? I’m interested in uploading a student paper and WCopyFind searching the internet for possible plagiarism. Can this program do that? Thanks!
WCopyfind can only compare documents that you have on hand, although it can compare far more than just two such documents. It can easily handle hundreds, thousands, or even more documents, limited only by the memory on your computer and your patience.
Scouring the internet for documents is a much more complicated task and one that typically requires a paid service provider. WCopyfind can’t do that sort of work. However, if you can locate the source(s) for a suspicious document, perhaps by Googling specific phrases that don’t sound original, you can download those sources and then feed them to WCopyfind, along with the suspicious document and WCopyfind can then do the matching.
Hello, I downloaded the source of WCopyFind, but I need to know which algorithm is used in comparing documents. Thanks a lot for your help!
I have finally written an explanation of the comparison process. You can read it here.
Hi Professor Bloomfield,
Your program is perfect :)
We want to use your program as an automatic plagiarism detection tool for the whole school. I know it is easy but I do not know c++ 🙁
We save the file links in a text document, and we want Copyfind to compare the uploaded file (from the browser) with the files listed in that text file. We can send the information via curl or use the shell to send commands from PHP. Our server runs Linux.
In short, we have to use a shell command or web form to send input to the Copyfind program instead of the Windows UI. Has anybody compiled the source code so that it works on a Linux machine from the Linux shell?
I will post Copyfind.4.1.0 as soon as I have time to write instructions for it. It can read simple commands from standard input or from a file, so it’s probably just what you want. It has almost zero Windows-specific code in it and I may be able to edit that code out, so as to produce a machine-independent program. That last step, however, is going to take some time that I don’t have right now. I’m at the “anaerobic” time of the semester, where all I can do is try to keep from collapsing. But these tasks are definitely on my short list of things to do.
I have developed something similar for the Linux environment. If you wish to use it, reach me at prince0206 on Google Mail.
Thanks a lot for your quick reply and great effort 🙂 Sending input and running the program via a Linux shell command is enough for me.
I just want to say that I introduced your program to my colleagues, and they are so excited 🙂
We do not know C++ and we cannot edit the source code. Therefore, we are anxiously waiting for your machine-independent program.
Wish you the best,
I just figured out that WCopyfind cannot open Word documents written with the Macintosh version of Word. The program simply crashes when there is a Macintosh Word file. I just wanted to report this. The program could show an ignore message and continue, or just skip the document and list the ignored documents in the log file (the one that is created automatically).
Thanks for the heads-up. Do the macintosh word files have an extension (e.g., .docx or .doc)? If they don’t, then my program will definitely have trouble with them. In any case, I should make my program handle them gracefully, even if it can’t read them properly.
The Macintosh Word document’s extension is .doc. It is possible to open it in Microsoft Word, but WCopyfind cannot open it and crashes 🙂
Great program. Thanks for taking the time to write it, maintain it, and share it with us. It is exactly what I was looking for.
As for the .doc issue, you might refer people here: http://blogs.msdn.com/b/ericwhite/archive/2008/09/19/bulk-convert-doc-to-docx.aspx This will allow bulk conversion from doc to docx, eliminating the issue of you needing to parse .doc files.
Thanks for the links to .doc conversion ideas. When I get a chance, I’ll take a look.
Hi and thanks so much for this tool.
I have just one comment, about where you say “It is licensed under the Gnu Public License, which basically means that you can do whatever you like with it except to try to sell it to someone else”
I’m afraid this contradicts the four free software freedoms that the GPL grants to users. What is not allowed is sublicensing your application, but a derivative work can be developed and sold under the GPL.
That said, obviously nobody would sell something that is offered for free!
I’ll have to take another look at the GPL. I wrote the original version of this software back when the United States still supported public higher education and I didn’t need the money. Now that the US is abandoning public education and we have gone 5 years without raises, I might rethink this.
First of all, thank you so much for this tool.
I am writing a dissertation for my degree and I would like to make sure I do not have any text matches.
Unfortunately, it does not work. The report says there is no matching. As I was a bit surprised, I copied and pasted a full paragraph from a web site and still got no match.
My dissertation is written in French, so I chose the French language. I also lowered all the thresholds. Nothing came up.
Thank you for your help
I’m not sure what’s causing the troubles you’re seeing. I suggest that you try comparing two identical files containing a long document (such as your dissertation) in Microsoft Word .DOCX format. If my program doesn’t find those two copies of the same thing matching, that indicates a serious problem. I’m more expecting that it will work and find huge matches in the documents. Then work backwards toward the comparison you want: paste the stuff you want to compare at the ends of the two copies and run the comparison again. I’m hoping that you’ll see the matching appear.
Thanks a lot for the wonderful software! It is simple and usable. I have used it not only for plagiarism detection but also to find highly similar constructions in different docs, which makes my life easier.
Thank you for providing this software!
It would also be helpful if you had the option of providing a template (e.g. the questions posed by the instructor) which would be excluded from the match list.
I have a version that allows for an exclude template, but I simply haven’t had time to post it here. Mostly, I need to write a help document because it reads a script file. It’s on my short list of things to do, but that list isn’t as short as it should be. Too much, too little time…
Hello Professor BloomField,
Thanks a lot for this wonderful tool. We have been using it on Windows and it has been quite useful. We badly need a Linux version. Is any work being done on this, and when can we expect one?
thanks a lot for your time.
I have found this program very useful in comparing MATLAB computer code (simple text files). Is it possible to use the “Load from file” option with a wildcard to load all MATLAB files in a specified directory (for example, by specifying C:\Users\username\Desktop\temp\*.m)? Thanks.
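A possible workaround in the meantime, sketched in Python rather than taken from WCopyfind itself: expand the wildcard outside the program and write the matching paths to a list file. The function name `write_file_list` is hypothetical, and the one-path-per-line UTF-8 layout is only an assumption about what “Load from file” accepts, so check it against the WCopyfind instructions before relying on it.

```python
from pathlib import Path


def write_file_list(directory, pattern, list_path):
    """Write every file in `directory` that matches `pattern` to `list_path`.

    Assumption: the list file holds one absolute path per line.
    Returns the matched paths, sorted, so the caller can inspect them.
    """
    paths = sorted(
        p.resolve() for p in Path(directory).glob(pattern) if p.is_file()
    )
    text = "".join(f"{p}\n" for p in paths)
    Path(list_path).write_text(text, encoding="utf-8")
    return paths
```

For the directory in the question, `write_file_list(r"C:\Users\username\Desktop\temp", "*.m", "matlab_files.txt")` would collect every .m file in that folder into matlab_files.txt.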
Can you upload the command line version of the program (CopyFind)?
A couple years ago you said you were going to make copyfind machine-independent, and now there seems to be a distinction between copyfind and wcopyfind. Was the command-line version ever completed? If so, where is it available?
I continue to use (and recommend) your excellent work. My sincere thanks for your efforts.
I have noticed two small issues which you might want to add to your list of things to do (if you ever find some free time).
1) If a document is named with non-standard letters, there is always an error (100% of the time). In my case, I used to name my files with the student name, and I have one student named Phạm Thu Ngân. When I included the file (e.g., ‘Assignment 1 Phạm Thu Ngân.docx’), the error occurred. However, simply renaming the file fixes the problem 100% of the time. I’ve adapted by using student IDs instead of names for the filenames, so it does not impact me much any more.
2) This is a bonus item (not a bug, just a request): It would be nice if the output html would allow the deletion of rows. The idea is I might have 300 rows to process and if I could click an ‘X’ to remove the row from the report, it would make my processing (which can span days) that much easier.
Again, thanks for making your work available. It really helps so many of us.
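For anyone who would rather keep student names in the originals, the renaming workaround from item 1 can be automated. The following is a minimal sketch in Python, not part of WCopyfind: the helper names `ascii_name` and `copy_with_ascii_names` are hypothetical, and it assumes that plain ASCII filenames are enough to avoid the error, which is what the report above suggests.

```python
import shutil
import unicodedata
from pathlib import Path


def ascii_name(name):
    """Strip accents from `name`; replace any other non-ASCII character with '_'."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c if ord(c) < 128 else "_" for c in stripped)


def copy_with_ascii_names(src_dir, dst_dir):
    """Copy each file in `src_dir` to `dst_dir` under an ASCII-only name.

    Returns a dict mapping each new name back to the original name,
    so matches in the report can be traced to the right student.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    mapping = {}
    for src in sorted(Path(src_dir).iterdir()):
        if not src.is_file():
            continue
        safe = Path(ascii_name(src.name))
        target = dst / safe.name
        n = 1
        while target.exists():  # two originals may collapse to the same ASCII name
            target = dst / f"{safe.stem}_{n}{safe.suffix}"
            n += 1
        shutil.copy2(src, target)
        mapping[target.name] = src.name
    return mapping
```

The originals are left untouched; only the copies in `dst_dir` would be fed to WCopyfind.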
what a nice program. Could you make sure file handles are closed after running a comparison? I noticed that I cannot move a file after the comparison has been run, I always need to close the program first.
During my test, it produced undoubtedly good results, but some features need a little more enhancement. It would be nicer if the generated report were shown in a highlighted format. While searching for comparison tools I found another tool; this is how you could design the reporting section: http://plagiarismcheckerx.com/side-by-side-comparison
But overall, it has many options that I like.
Several years back I mentioned how useful I’d found WCF when comparing two supposedly “independent” Land Use Plans meant to guide Pittsboro, NC into the future. One plan had been provided as a “model” by the lobbyist/planner for the corporation that wants to convert a historic 3000-citizen town to a city of 50000+ residents. The next was produced by the planning board; they swore they did it all by their lonesome. But WCF revealed that 70% of the board’s plan was taken verbatim from the developer’s plan. Now push is coming to shove and our town Board of Commissioners is getting ready to vote on the 7000-acre project tonight. At least the analysis your tool provided has gotten SOME people thinking!
I’ve used this software successfully in the past, but today when I try to enter a batch of approximately 400 html documents in the “New Document File” for comparison, it freezes up half way through the loading document stage (prior to document comparison). I don’t think the number of files is the problem, since I’ve compared large batches before. What else could cause this problem?
Have you switched to the newest version? If the old version worked and the new version fails, then perhaps I’ve broken something. I haven’t tested it hard since the latest update. If, for example, WCopyfind 4.4.1 works and WCopyfind 4.4.2 hangs, then I’d know where to look for trouble.
Error possibly found and fixed. Please try Version 4.1.3. — Lou
I am trying to troubleshoot an issue we’re encountering with this program version 4.1.2 for a Faculty Member. He is attempting to run a comparison of HTML files with about 300-400 files and each time he is trying to “Run” the program it is freezing up before completing the analysis. He stated that he has run it in the past with no issues.
Any thoughts on what may be causing this issue? We have already tried rebooting his PC and also tried to run the x86 and x64 versions. We’re running Windows 7 x64.
If version 4.1.2 freezes, please try version 4.1.1 and let me know what happens. The link for it is farther down the page. Knowing whether the bug is only in 4.1.2 will help me find it. In the meantime, it would be nice if the older version works.
Error possibly found and fixed. Please try Version 4.1.3. — Lou
I just tried comparing about 1000 small html documents and then about 8000 small text documents. I can’t get WCopyfind188.8.131.52 or WCopyfind.4.1.2 to hang on my machine (Windows 8.0). It would be nice to have thousands of large html files to compare, but I can’t think where to obtain such a collection quickly. I’ll search for every html document on my computers and feed them all at once to WCopyfind and see what happens.
I loaded 45,000 html files from my computers into WCopyfind.4.1.2 and got it to crash after loading about 30,000 of them. Now I have something to debug! I’m working on it.
I found a serious bug that caused WCopyfind.4.1.2 and WCopyfind184.108.40.206 to crash if they had trouble accessing a file during the loading process. On my machine, the crash resulted in a “WCopyfind has stopped working” message from Windows 8. On other machines, it might cause a hang instead.
I fixed the bug and enhanced WCopyfind’s reporting of loading errors. It now describes the error that has occurred, rather than just reporting an error number. I have posted the fixed version as WCopyfind 4.1.3.
I found another serious bug that caused both WCopyfind and Copyfind to hang while loading imperfectly formatted html files. I have fixed this bug in version 4.1.4 of each software program.
Hi. It’s wonderful.
I downloaded it, but it is not compatible with Unicode or UTF-8 languages.
I am Hesam.
Are there solutions to this problem?
Thanks a lot for this tool
It should handle Unicode, but it has been a while since I worked on the code and I cannot remember all the details. The problem may be with the right-to-left ordering of some languages. I had to deal with those complications for translation work I did in my Coursera MOOC (How Things Work 1) and will try (eventually) to put them into Copyfind and WCopyfind. I just have too much to do and not enough time. I also want to fix up languages in which a single character is a word and there are no blanks between those characters.
Yeah! It works with UTF-8 encoding, just not with Unicode.
Thank you my friend.
I Love Prophet Muhammad and you.
Thanks for the great software!
I wanted to report a small bug. If the file paths/names of any compared files are too long, WCopyfind silently fails to generate the corresponding output HTML documents. I believe this happens when an output file path/name would be longer than MAX_PATH (260 characters). For example, try comparing two files with name/path lengths of ~150 characters each.
It would be great if the software would at least print a warning if this situation occurs. Other (more challenging) options include changing the output file name format or changing your Windows API calls to support long paths with the “\\?\” prefix.
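To make the two suggestions concrete, here is a rough sketch in Python (WCopyfind itself is C++, and none of this is taken from its source). The helper names are hypothetical. It shows a length check that could drive a warning, and how an absolute path is rewritten into the “\\?\” extended-length form, which Windows only honors in the wide-character (Unicode) versions of its file functions.

```python
import ntpath  # Windows path rules; importable on any operating system

MAX_PATH = 260  # classic Windows limit; it counts the terminating null character

EXT_PREFIX = "\\\\?\\"           # the \\?\ prefix, used for drive-letter paths
EXT_UNC_PREFIX = "\\\\?\\UNC\\"  # the \\?\UNC\ form, used for \\server\share paths


def exceeds_max_path(path):
    """True if `path` is too long for Windows calls limited to MAX_PATH."""
    return len(path) >= MAX_PATH


def to_extended_length(path):
    """Return an absolute Windows path in extended-length form."""
    if path.startswith(EXT_PREFIX):
        return path  # already prefixed
    # The prefix switches off Windows' own normalization, so normalize first.
    full = ntpath.normpath(path)
    if full.startswith("\\\\"):
        return EXT_UNC_PREFIX + full[2:]
    drive, rest = ntpath.splitdrive(full)
    if not drive or not rest.startswith("\\"):
        raise ValueError("extended-length paths must be absolute: " + path)
    return EXT_PREFIX + full


def check_report_path(path):
    """Return a warning message for an over-long report path, else None."""
    if exceeds_max_path(path):
        return "Report path is %d characters; the limit is %d: %s" % (
            len(path), MAX_PATH - 1, path)
    return None
```

A program could call something like `check_report_path` before writing each report file and show the message instead of failing silently, or pass the result of `to_extended_length` to the file-creation call.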
Sorry about that flaw. I’ll keep it in mind and try to fix it when I get a chance to edit the code again. I appreciate your help in finding it.
Love your program!
I seem to have a problem with the number of files to compare: If the total number of old and new documents exceeds 660, the program stops with a “Cannot load file, used by another program.” – error message.
The content of the error message is wrong: if I remove this file from the list, the message appears for the next file. Thus it seems to be a size restriction, either in the number of files or in the memory used.
Windows 7, either the 32 or 64bit version.
Hmmm… there shouldn’t be any fundamental limit on the number of files read and I have used it for several thousand files at once. It may be a memory problem being misreported somehow (or a bug that I’ve introduced too recently to have tested with thousands of files). I don’t have time to chase the bug now, but I’ll look into it when I have a chance. In the meantime, the command-line version of the program (Copyfind) can load and unload files, so it’s possible to compare huge numbers of files in batches.
You are on the mark, it is a memory problem, not a “number of files” problem.
Just checked again: I included 100 “old” files and one “new” file to compare and ran the check several times (just clicking again on the “Run” button). **With each run, the memory allocated to the program increased according to the task manager.** At the 7th run, I got a “Cannot open file, in use by another program” error. Task manager showed around 51 MB memory allocated. The same experiment with 200 “old” files: the program stopped at the 4th run with 54 MB allocated.
I found my way around by comparing in smaller batches, but you might want to keep it in mind.
Thanks again for the program!
Is there any update on the program regarding the bug mentioned above? I really love to use the program, but I currently have to run everything in small batches to avoid the program crashing. Btw. : I have the same problem using the older versions.
(Running Windows 7)
Please ignore my post above. This year’s problems I could trace to some corrupted files. (Not necessarily those displayed in the error message.)
Still love your program!
I’m thrilled to have found your program — it’ll help me do some research on borrowings between literary texts I’m working on. The only problem is, WCopyfind hasn’t worked once for me, even testing on duplicate short .txt files (97 words). Every time, I’ll load the two files, set the report file destination folder, and hit Run, with all the default settings intact, and after a few seconds, I’m told “WCopyfind has stopped working. A problem caused the program to stop working correctly. Please close the program.” (No error number.) It generates matches.html, matches.txt, and log.txt, but the former two are blank files, the latter reads:
Starting Report Files
Starting to Load and Hash-Code Documents
I’m running WCopyfind220.127.116.11 on Windows 7 Enterprise SP 1, 64-bit OS. Any thoughts on what’s going on?
It sounds like I broke something when I created version 4.1.5. It appears to be crashing during the document loading process, which is what I edited in going to version 4.1.5.
I suggest that you try the earlier versions (WCopyfind.4.1.4 or WCopyfind18.104.22.168), which appear farther down the WCopyfind page of this web site. Let me know if those work for you. Alternatively, try a different computer.
P.S. It’s also possible that you need the Microsoft Visual Studio 2013 redistributable library. It’s located at:
It’s on all my machines already, so I can’t test to see if not having it causes a delayed crash.
Downloaded MS Visual Studio (64-bit), installed and restarted, then tried v. 4.1.5, 4.1.4 (both 64- and 32- bit), 4.1.1. All crashing at the same point. So I’m guessing a problem with this particular computer…
I’ve tried running the program (4.1.5, 4.1.4, and 4.1.2, 64-bit) on another computer (Windows 7 Pro, SP1, 64-bit) after installing Microsoft Visual Studio 2013. The program crashes at exactly the same point, even when I try to run a comparison using two short, identical files. (If it helps any, it takes longer to crash when I try to run a comparison using two different, long files.) Any thoughts? If not, could you suggest another program that works, that I could use in the meantime?
Magic. You’ve no idea how long I’ve been looking for an app that would do this. Thanks.
I index my PDFs (2500-3000, with Zotero) and then I add tags (currently nudging 600). You can imagine that looking through a book or long document to see if I’ve got a relevant tag takes a while and is inefficient. What I needed was a program which would look through my tag list and see if the document contained any of those words. It has been incredibly difficult finding one but now I have! The Brief report is exactly what I wanted and I note your suggestions in Interesting Things WCopyfind Can Do.
Suggestion: Once you get all your publishing done, could you please arrange for the program to remember its settings? Add a Save Settings button? This seems like such an obvious function! Having to set it up the way I need it each time is a small price to pay but well, you know …
Thanks again. Best wishes from the UK.
Hi, why am I getting this: “Error: This docx file cannot be read properly”? How can it be solved? I used the software before with no problem, following exactly the same procedure. Hope you can help, thanks!