10 January 2007

Rails 1.2.1 duplicate lines Simian Report

I have been evaluating the excellent and super fast Similarity Analyser (Simian) by RedHill Consulting, using it to generate a duplicate-lines report from the Rails 1.2.1 source code (excluding the tests) with a threshold of 5 lines.

A word on the threshold parameter: it is set to 5 here, hence identical blocks of lines whose length is below 5 are not reported. See the official features page of Simian.

Update 23/01/2007: There was a bug in Simian 2.2.12 that treated lines containing the "end" keyword as duplicates. Happily, Simon Harris fixed it within a couple of days and released version 2.2.13.

I have written a bit of code to generate the following HTML report from Simian's text report. I will publish it as soon as I have packaged it as a gem and documented it.
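That code is not published yet, so here is only a minimal sketch of how the parsing step might look, assuming the text format shown in the report further down. The names `Location`, `DuplicateBlock` and `parse_simian_report` are hypothetical and chosen for illustration; they are not part of Simian or of the upcoming gem.

```ruby
# Hypothetical names: Location, DuplicateBlock and parse_simian_report are
# illustrative only, not part of Simian or of the code mentioned above.
Location       = Struct.new(:first_line, :last_line, :path)
DuplicateBlock = Struct.new(:line_count, :locations)

# Line shapes taken from the text report reproduced in this post.
BLOCK_HEADER = /\AFound (\d+) duplicate lines in the following files:\z/
BLOCK_ENTRY  = /\ABetween lines (\d+) and (\d+) in (.+)\z/

def parse_simian_report(text)
  blocks = []
  text.each_line do |raw|
    line = raw.strip
    if (m = BLOCK_HEADER.match(line))
      # A header opens a new group of duplicated locations.
      blocks << DuplicateBlock.new(m[1].to_i, [])
    elsif (m = BLOCK_ENTRY.match(line)) && !blocks.empty?
      # An entry belongs to the most recently opened group.
      blocks.last.locations << Location.new(m[1].to_i, m[2].to_i, m[3])
    end
    # Anything else (banner, options, summary) is ignored.
  end
  blocks
end

sample = <<~REPORT
  Found 5 duplicate lines in the following files:
   Between lines 688 and 692 in railties/lib/initializer.rb
   Between lines 37 and 41 in activesupport/lib/active_support/ordered_options.rb
  Found 1074 duplicate lines in 121 blocks in 60 files
REPORT

parse_simian_report(sample).each do |block|
  puts "#{block.line_count} lines: #{block.locations.map(&:path).join(', ')}"
end
```

Once the report is a list of structs, rendering it as HTML (or sorting by block size, or grouping by file) becomes a matter of iterating over `blocks`.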

Here is the report generated on the 23rd of January 2007:


Similarity Analyser 2.2.13 - http://www.redhillconsulting.com.au/products/simian/index.html

Copyright (c) 2003-07 RedHill Consulting Pty. Ltd. All rights reserved.

Simian is not free unless used solely for non-commercial or evaluation purposes.

{failOnDuplication=true, ignoreCharacterCase=true, ignoreCurlyBraces=true, ignoreIdentifierCase=true, ignoreModifiers=true, ignoreStringCase=true, threshold=5}

Loading (recursively) *[^t][^e][^s][^t].rb from /home/jeanmichel/ruby/projects/duplicate-lines-report/tmp

Found 5 duplicate lines in the following files:

   Between lines 745 and 751 in activerecord/lib/active_record/connection_adapters/frontbase_adapter.rb

   Between lines 90 and 96 in activerecord/lib/active_record/connection_adapters/abstract/schema_statements.rb

Found 5 duplicate lines in the following files:

   Between lines 688 and 692 in railties/lib/initializer.rb

   Between lines 37 and 41 in activesupport/lib/active_support/ordered_options.rb

Found 5 duplicate lines in the following files:

   Between lines 218 and 224 in actionmailer/lib/action_mailer/vendor/tmail/encode.rb

   Between lines 147 and 153 in actionmailer/lib/action_mailer/vendor/tmail/encode.rb

Found 5 duplicate lines in the following files:

   Between lines 418 and 424 in actionpack/lib/action_controller/assertions/selector_assertions.rb

   Between lines 293 and 301 in actionpack/lib/action_controller/assertions/selector_assertions.rb

Found 5 duplicate lines in the following files:

   Between lines 2 and 7 in railties/lib/commands/servers/webrick.rb

   Between lines 9 and 14 in railties/lib/commands/servers/mongrel.rb

Found 5 duplicate lines in the following files:

   Between lines 95 and 100 in activerecord/lib/active_record/deprecated_associations.rb

   Between lines 35 and 40 in activerecord/lib/active_record/deprecated_associations.rb

Found 5 duplicate lines in the following files:

   Between lines 43 and 47 in actionwebservice/examples/metaWeblog/apis/blogger_api.rb

   Between lines 34 and 38 in actionwebservice/examples/metaWeblog/apis/blogger_api.rb

Found 5 duplicate lines in the following files:

   Between lines 77 and 81 in actionwebservice/examples/googlesearch/delegated/google_search_service.rb

   Between lines 27 and 31 in actionwebservice/examples/googlesearch/direct/search_controller.rb

   Between lines 26 and 30 in actionwebservice/examples/googlesearch/autoloading/google_search_controller.rb

Found 5 duplicate lines in the following files:

   Between lines 178 and 183 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

   Between lines 127 and 132 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

Found 5 duplicate lines in the following files:

   Between lines 676 and 683 in railties/lib/initializer.rb

   Between lines 6 and 13 in activesupport/lib/active_support/ordered_options.rb

Found 5 duplicate lines in the following files:

   Between lines 679 and 683 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

   Between lines 678 and 682 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

Found 5 duplicate lines in the following files:

   Between lines 1375 and 1381 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

   Between lines 1359 and 1365 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

Found 5 duplicate lines in the following files:

   Between lines 68 and 72 in activerecord/lib/active_record/connection_adapters/openbase_adapter.rb

   Between lines 117 and 121 in activerecord/lib/active_record/connection_adapters/sqlite_adapter.rb

Found 5 duplicate lines in the following files:

   Between lines 25 and 31 in railties/lib/rails_generator.rb

   Between lines 29 and 33 in actionpack/lib/action_controller.rb

   Between lines 29 and 33 in activerecord/lib/active_record.rb

Found 5 duplicate lines in the following files:

   Between lines 202 and 213 in activerecord/lib/active_record/connection_adapters/mysql_adapter.rb

   Between lines 180 and 193 in activerecord/lib/active_record/connection_adapters/oracle_adapter.rb

Found 5 duplicate lines in the following files:

   Between lines 831 and 837 in railties/lib/commands/plugin.rb

   Between lines 809 and 815 in railties/lib/commands/plugin.rb

   Between lines 774 and 780 in railties/lib/commands/plugin.rb

   Between lines 603 and 609 in railties/lib/commands/plugin.rb

   Between lines 573 and 579 in railties/lib/commands/plugin.rb

   Between lines 546 and 552 in railties/lib/commands/plugin.rb

Found 5 duplicate lines in the following files:

   Between lines 5 and 16 in railties/lib/commands/server.rb

   Between lines 78 and 89 in railties/lib/commands/process/spawner.rb

Found 5 duplicate lines in the following files:

   Between lines 226 and 235 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

   Between lines 208 and 218 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

Found 5 duplicate lines in the following files:

   Between lines 2 and 7 in railties/lib/rails/version.rb

   Between lines 2 and 7 in actionwebservice/lib/action_web_service/version.rb

Found 5 duplicate lines in the following files:

   Between lines 12 and 18 in activerecord/lib/active_record/connection_adapters/openbase_adapter.rb

   Between lines 70 and 75 in activerecord/lib/active_record/connection_adapters/mysql_adapter.rb

Found 5 duplicate lines in the following files:

   Between lines 499 and 503 in activerecord/lib/active_record/connection_adapters/firebird_adapter.rb

   Between lines 478 and 482 in activerecord/lib/active_record/connection_adapters/firebird_adapter.rb

Found 5 duplicate lines in the following files:

   Between lines 317 and 321 in actionpack/lib/action_controller/vendor/html-scanner/html/selector.rb

   Between lines 296 and 300 in actionpack/lib/action_controller/vendor/html-scanner/html/selector.rb

   Between lines 277 and 281 in actionpack/lib/action_controller/vendor/html-scanner/html/selector.rb

Found 5 duplicate lines in the following files:

   Between lines 1 and 7 in activerecord/test/fixtures/migrations/1_people_have_last_names.rb

   Between lines 1 and 7 in activerecord/test/fixtures/migrations_with_duplicate/1_people_have_last_names.rb

   Between lines 1 and 7 in activerecord/test/fixtures/migrations_with_missing_versions/1_people_have_last_names.rb

Found 5 duplicate lines in the following files:

   Between lines 680 and 685 in activerecord/lib/active_record/vendor/simple.rb

   Between lines 627 and 631 in activerecord/lib/active_record/vendor/simple.rb

Found 6 duplicate lines in the following files:

   Between lines 16 and 25 in activesupport/lib/active_support/core_ext/time/conversions.rb

   Between lines 11 and 21 in activesupport/lib/active_support/core_ext/date/conversions.rb

Found 6 duplicate lines in the following files:

   Between lines 197 and 202 in activerecord/lib/active_record/connection_adapters/sqlserver_adapter.rb

   Between lines 134 and 139 in activerecord/lib/active_record/connection_adapters/sybase_adapter.rb

Found 6 duplicate lines in the following files:

   Between lines 284 and 291 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

   Between lines 344 and 349 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

Found 6 duplicate lines in the following files:

   Between lines 267 and 272 in activerecord/lib/active_record/validations.rb

   Between lines 825 and 830 in activerecord/lib/active_record/validations.rb

Found 6 duplicate lines in the following files:

   Between lines 277 and 285 in actionpack/lib/action_controller/vendor/html-scanner/html/selector.rb

   Between lines 296 and 304 in actionpack/lib/action_controller/vendor/html-scanner/html/selector.rb

Found 6 duplicate lines in the following files:

   Between lines 358 and 365 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

   Between lines 498 and 505 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

Found 6 duplicate lines in the following files:

   Between lines 175 and 185 in activerecord/lib/active_record/connection_adapters/oracle_adapter.rb

   Between lines 108 and 118 in activerecord/lib/active_record/connection_adapters/openbase_adapter.rb

Found 7 duplicate lines in the following files:

   Between lines 408 and 414 in actionwebservice/test/abstract_dispatcher.rb

   Between lines 398 and 404 in actionwebservice/test/abstract_dispatcher.rb

Found 7 duplicate lines in the following files:

   Between lines 518 and 530 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

   Between lines 221 and 233 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

Found 7 duplicate lines in the following files:

   Between lines 76 and 82 in actionmailer/lib/action_mailer/vendor/tmail/scanner_r.rb

   Between lines 68 and 74 in actionmailer/lib/action_mailer/vendor/tmail/scanner_r.rb

Found 7 duplicate lines in the following files:

   Between lines 164 and 176 in activerecord/lib/active_record/connection_adapters/openbase_adapter.rb

   Between lines 264 and 276 in activerecord/lib/active_record/connection_adapters/mysql_adapter.rb

Found 7 duplicate lines in the following files:

   Between lines 1 and 10 in activerecord/test/fixtures/migrations/3_innocent_jointable.rb

   Between lines 1 and 10 in activerecord/test/fixtures/migrations_with_duplicate/3_innocent_jointable.rb

   Between lines 1 and 10 in activerecord/test/fixtures/migrations_with_missing_versions/4_innocent_jointable.rb

Found 7 duplicate lines in the following files:

   Between lines 181 and 192 in actionpack/lib/action_controller/assertions/selector_assertions.rb

   Between lines 60 and 67 in actionpack/lib/action_controller/assertions/selector_assertions.rb

Found 7 duplicate lines in the following files:

   Between lines 204 and 213 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

   Between lines 148 and 157 in actionmailer/lib/action_mailer/vendor/tmail/parser.rb

Found 7 duplicate lines in the following files:

   Between lines 279 and 285 in activerecord/lib/active_record/connection_adapters/frontbase_adapter.rb

   Between lines 94 and 100 in activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb

Found 7 duplicate lines in the following files:

   Between lines 123 and 132 in actionpack/lib/action_controller/assertions/response_assertions.rb

   Between lines 69 and 78 in actionpack/lib/action_controller/assertions/routing_assertions.rb

Found 7 duplicate lines in the following files:

   Between lines 18 and 25 in railties/lib/rails_generator/generators/components/web_service/web_service_generator.rb

   Between lines 14 and 21 in railties/lib/rails_generator/generators/components/controller/controller_generator.rb

Found 7 duplicate lines in the following files:

   Between lines 506 and 512 in activerecord/lib/active_record/connection_adapters/sybase_adapter.rb

   Between lines 534 and 540 in activerecord/lib/active_record/connection_adapters/sqlserver_adapter.rb

Found 8 duplicate lines in the following files:

   Between lines 58 and 69 in railties/lib/rails_generator/generators/components/scaffold/scaffold_generator.rb

   Between lines 18 and 29 in railties/lib/rails_generator/generators/components/resource/resource_generator.rb

   Between lines 18 and 29 in railties/lib/rails_generator/generators/components/scaffold_resource/scaffold_resource_generator.rb

Found 8 duplicate lines in the following files:

   Between lines 37 and 49 in actionmailer/lib/action_mailer/vendor/tmail/stringio.rb

   Between lines 178 and 190 in actionmailer/lib/action_mailer/vendor/tmail/stringio.rb

Found 8 duplicate lines in the following files:

   Between lines 24 and 33 in activerecord/lib/active_record.rb

   Between lines 24 and 33 in actionpack/lib/action_controller.rb

Found 8 duplicate lines in the following files:

   Between lines 366 and 374 in activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb

   Between lines 219 and 227 in activerecord/lib/active_record/connection_adapters/sqlserver_adapter.rb

Found 8 duplicate lines in the following files:

   Between lines 5 and 18 in activesupport/lib/active_support/core_ext/module/attribute_accessors.rb

   Between lines 23 and 36 in activesupport/lib/active_support/core_ext/module/attribute_accessors.rb

Found 8 duplicate lines in the following files:

   Between lines 5 and 18 in activesupport/lib/active_support/core_ext/class/attribute_accessors.rb

   Between lines 23 and 36 in activesupport/lib/active_support/core_ext/class/attribute_accessors.rb

Found 9 duplicate lines in the following files:

   Between lines 6 and 14 in railties/lib/commands/process/spawner.rb

   Between lines 3 and 11 in railties/lib/commands/process/spinner.rb

   Between lines 4 and 12 in activesupport/lib/active_support/core_ext/kernel/daemonizing.rb

Found 10 duplicate lines in the following files:

   Between lines 77 and 88 in actionpack/lib/action_controller/assertions/selector_assertions.rb

   Between lines 195 and 210 in actionpack/lib/action_controller/assertions/selector_assertions.rb

Found 10 duplicate lines in the following files:

   Between lines 740 and 757 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

   Between lines 836 and 853 in actionmailer/lib/action_mailer/vendor/tmail/header.rb

Found 10 duplicate lines in the following files:

   Between lines 458 and 469 in activerecord/lib/active_record/connection_adapters/frontbase_adapter.rb

   Between lines 482 and 493 in activerecord/lib/active_record/connection_adapters/frontbase_adapter.rb

Found 11 duplicate lines in the following files:

   Between lines 64 and 75 in actionwebservice/examples/googlesearch/delegated/google_search_service.rb

   Between lines 14 and 25 in actionwebservice/examples/googlesearch/direct/search_controller.rb

   Between lines 13 and 24 in actionwebservice/examples/googlesearch/autoloading/google_search_controller.rb

Found 12 duplicate lines in the following files:

   Between lines 40 and 52 in railties/lib/rails_generator/generators/components/scaffold/scaffold_generator.rb

   Between lines 2 and 14 in railties/lib/rails_generator/generators/components/resource/resource_generator.rb

   Between lines 2 and 14 in railties/lib/rails_generator/generators/components/scaffold_resource/scaffold_resource_generator.rb

Found 12 duplicate lines in the following files:

   Between lines 84 and 106 in actionwebservice/examples/googlesearch/delegated/google_search_service.rb

   Between lines 33 and 55 in actionwebservice/examples/googlesearch/autoloading/google_search_controller.rb

   Between lines 34 and 56 in actionwebservice/examples/googlesearch/direct/search_controller.rb

Found 19 duplicate lines in the following files:

   Between lines 53 and 81 in railties/lib/rails_generator/generators/components/scaffold_resource/scaffold_resource_generator.rb

   Between lines 42 and 69 in railties/lib/rails_generator/generators/components/resource/resource_generator.rb

Found 29 duplicate lines in the following files:

   Between lines 2 and 40 in railties/lib/rails_generator/generators/components/scaffold_resource/scaffold_resource_generator.rb

   Between lines 2 and 40 in railties/lib/rails_generator/generators/components/resource/resource_generator.rb

Found 37 duplicate lines in the following files:

   Between lines 3 and 56 in actionwebservice/examples/googlesearch/direct/search_controller.rb

   Between lines 2 and 55 in actionwebservice/examples/googlesearch/autoloading/google_search_controller.rb

Found 38 duplicate lines in the following files:

   Between lines 2 and 81 in activesupport/lib/active_support/binding_of_caller.rb

   Between lines 2 and 82 in railties/lib/binding_of_caller.rb

Found 41 duplicate lines in the following files:

   Between lines 1 and 49 in actionwebservice/examples/googlesearch/delegated/google_search_service.rb

   Between lines 1 and 49 in actionwebservice/examples/googlesearch/direct/google_search_api.rb

   Between lines 1 and 49 in actionwebservice/examples/googlesearch/autoloading/google_search_api.rb

Found 1074 duplicate lines in 121 blocks in 60 files

Processed a total of 23454 significant (46787 raw) lines in 375 files

Processing time: 1.347sec
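A side note on the load pattern shown at the top of the report, `*[^t][^e][^s][^t].rb`: each bracket excludes a single character at a single position, so the pattern filters on the four characters just before `.rb` rather than on the word "test", and it says nothing about directories. That may be why a few files under `test/` directories still show up above. The sketch below illustrates the behaviour with Ruby's `File.fnmatch`, on the assumption (not verified here) that Simian's matcher behaves like a shell-style glob. `loaded?` is a hypothetical helper and the file names are picked for illustration.

```ruby
# Assumption: Simian's matcher behaves like a shell-style glob. Ruby's
# File.fnmatch is only a stand-in used to illustrate the pattern.
PATTERN = '*[^t][^e][^s][^t].rb'

# Hypothetical helper: would a file with this basename be picked up?
def loaded?(basename)
  File.fnmatch(PATTERN, basename)
end

loaded?('parser.rb')              # => true
loaded?('abstract_dispatcher.rb') # => true, even if it lives under test/
loaded?('base_test.rb')           # => false, the name ends in "test"
loaded?('base.rb')                # => false, the "s" sits in the third slot
```

Under that assumption, a name such as `base.rb` is skipped even though it is not a test, because only one of the four positions needs to hit its excluded letter.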



Now the debate is open about what to do with these reports!

3 comments:

Sam Aaron said...

I think that most of these duplicated lines are useless reports - a whole bunch of end lines, or a private statement surrounded by blank lines. I think the tool should be clever enough to ignore these things...

It would also be nice if the output showed what the duplicated lines were so you could quickly parse the output to find the real nasty cut'n'paste jobs if there are any.

Jean-Michel said...

To Sam Aaron:

Thanks for your feedback. You might have misunderstood how Simian works: I have set a threshold of 4 lines (the minimum is 2!). This threshold, plus the fact that Simian ignores blank lines, should keep useless information out of the report :-) even if I could have set a higher threshold.

I agree with you about the output. I am working on an HTML output which will link to the Rails Trac browser.

Anonymous said...

I've emailed you a patched version of simian to try out. If it works, I'll release it to production.

Regards,

Simon