KDD Cup 2004


  • September 3, 2004: The submission interface has reopened. You may submit predictions for both problems and your performance will be immediately displayed in two new results tables, one for the Protein Homology problem, and one for the Quantum Physics problem.

  • August 30, 2004: Additional information about the Protein Homology problem is now available from Rob Elber.

  • August 26, 2004: The winners have been announced. See the Protein or Physics winners.

    Also, the slides from the presentation at the KDD Conference are available for download.

  • July 23, 2004: The results are up! View the Protein or Physics results now.

  • July 12, 2004: Today a participant pointed out that PERF computes SLQ incorrectly when some predictions have the value 1.0. A new version of PERF is available on the WWW site. Because of this, we are giving extra time for SLQ submissions, which are now due on Monday, July 26 at midnight PST. To be fair to those who might not be able to submit predictions for the revised SLQ-score, we will not include the SLQ-score when determining the overall winner on the Physics problem; the overall winner will be determined by the other three measures only. However, there will still be an honorable mention for the SLQ-score, based on submissions made by the new July 26 deadline.

  • July 5, 2004: The submission interface is now open.

  • July 5, 2004: One of the participants pointed out that on some platforms, PERF might underestimate APR by a small amount when there are certain numbers of ties. Most likely, this will not affect your results. However, we have posted a more robust version on the KDD-Cup Web page. If you want to be absolutely certain, use the new code. Note again that this affects only APR on the protein problem.

  • June 30, 2004: Added a short tutorial on the performance metrics.

  • June 18, 2004: FAQ updated to include discussion of the SLQ score used in the physics problem.

  • June 8, 2004: PERF version 5.10 has been released. It fixes the problem with average precision. YOU SHOULD DOWNLOAD AND USE THE NEW SOFTWARE. See the FAQ or software download page for details.

  • May 28, 2004: A bug was found in PERF's average precision calculation. See the FAQ or software download page for details.

  • April 28, 2004: Tasks and datasets are available on this WWW-site.


The KDD-Cup 2004 knowledge discovery and data mining competition will be held in conjunction with the Tenth Annual ACM SIGKDD Conference. The challenge tasks are selected to be interesting to participants from both academia and industry. In particular, we encourage the participation of students.

This year's competition focuses on data mining for a variety of performance criteria such as Accuracy, Squared Error, Cross Entropy, and ROC Area. As described on this WWW-site, there are two main tasks based on two datasets from the areas of bioinformatics and quantum physics.
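
For readers unfamiliar with the four criteria named above, the following is a minimal Python sketch of how they are commonly computed for binary labels (0 or 1) and predicted values in [0, 1]. It is an illustration only and is not the PERF code: the function names, the 0.5 threshold, the use of the mean (rather than the root mean) for squared error, and the clipping constant are all assumptions made here, and the official definitions used for scoring may differ in detail.

```python
import math


def accuracy(labels, preds, threshold=0.5):
    """Fraction of cases classified correctly at an assumed threshold."""
    correct = sum((p >= threshold) == (y == 1) for y, p in zip(labels, preds))
    return correct / len(labels)


def squared_error(labels, preds):
    """Mean squared difference between predicted value and 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, preds)) / len(labels)


def cross_entropy(labels, preds, eps=1e-12):
    """Mean negative log-likelihood of the labels under the predictions.

    Predictions are clipped to [eps, 1 - eps] (an assumption of this
    sketch) so that values of exactly 0.0 or 1.0 do not produce log(0).
    """
    total = 0.0
    for y, p in zip(labels, preds):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def roc_area(labels, preds):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher prediction; ties count as half.
    Requires at least one positive and one negative case.
    """
    pos = [p for y, p in zip(labels, preds) if y == 1]
    neg = [p for y, p in zip(labels, preds) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from either competition dataset.
    y = [1, 0, 1, 0]
    p = [0.9, 0.8, 0.4, 0.1]
    print("accuracy      ", accuracy(y, p))
    print("squared error ", squared_error(y, p))
    print("cross entropy ", cross_entropy(y, p))
    print("ROC area      ", roc_area(y, p))
```

Accuracy depends on a threshold, squared error and cross entropy depend on the predicted values themselves, and ROC area depends only on how the predictions are ordered. A single set of predictions therefore need not be best on all criteria at once.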

We are looking forward to an interesting competition and encourage your participation.

Rich Caruana and Thorsten Joachims (KDD-Cup04 Co-Chairs)