Wednesday, February 28, 2007

project 1 statistics

Hi Everyone,
I used the following grading scheme for project 1.
 
Algorithm description and analysis - 15
Program                            - 15
Results                            - 10
---------------------------------------
Total                              - 40
 
The statistics for project 1 are:
Mean               - 35.78
Standard Deviation -  3.58
Max                - 40.00
Min                - 28.00
 
Here are some suggestions:
1. Some students implemented vector space ranking by iterating over all the documents and checking whether each one contains a query term. An inverted index can be used instead: it directly gives the set of documents that contain each query term.
2. You can include the time and space complexity of your algorithm. This is not mandatory.
3. Use only the terms from the index whose field is "contents".
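
To illustrate suggestion 1, here is a minimal Python sketch of ranking through an inverted index. It is not the project's index API: the function names (build_inverted_index, rank), the toy documents, and the whitespace tokenizer are all assumptions made for the example. The scores are plain tf-idf sums; a full vector space ranking would also normalize by document vector length.

```python
import math
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a postings dict {doc_id: term_frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def rank(query, index, num_docs):
    """Score only the documents that appear in the postings of a query term."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy collection, for illustration only
docs = {
    1: "pagerank ranks web pages",
    2: "vector space ranking uses term weights",
    3: "hubs and authorities link web pages",
}
index = build_inverted_index(docs)
results = rank("vector ranking", index, len(docs))
```

The work done per query depends on the length of the postings lists for the query terms, not on the total number of documents, which is the point of the suggestion.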
 
For project 2, you need to compare the results of authority/hub, PageRank, and vector space ranking. You also need to analyze the effect of changing the parameters of PageRank and authority/hub (for example, the damping factor for PageRank and the root set size for authority/hub).
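
As a rough idea of what a parameter experiment could look like, the sketch below runs a simple power-iteration PageRank for several damping factors. It is only an illustration under assumptions: the pagerank function, the four-page graph, and the way dangling pages are handled (their rank is spread evenly over all pages) are choices made for this example, not the required project setup. Every link target is assumed to also appear as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page first receives the "random jump" share
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outlinks in links.items():
            if outlinks:
                share = damping * rank[node] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Assumption: a page without outlinks spreads its rank evenly
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph, for illustration only
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for d in (0.5, 0.85, 0.95):
    scores = pagerank(graph, damping=d)
    print(d, {page: round(score, 3) for page, score in sorted(scores.items())})
```

Comparing the printed scores across damping factors shows how strongly the link structure, rather than the uniform jump, shapes the ranking.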
 
Bhaumik
