Massive fullindex speedup
Updated 5 months, 3 weeks ago
| David Wolever | Reviewers | ||
| Basie-Dev | |||
| None | Basie SVN | ||
Speed up fullindex by committing all the changes at once. A fullindex now runs in ~1 minute on my laptop.
The testrunner is… Still running.
Review request changed
Updated 6 months, 2 weeks ago (February 20th, 2010, 8:27 a.m.)
-
The testrunner is… Still running.
Review request changed
Updated 6 months, 2 weeks ago (February 20th, 2010, 8:28 a.m.)
-
Speed up fullindex by committing all the changes at once.
Speed up fullindex by committing all the changes at once. A fullindex now runs in ~1 minute on my laptop.
Posted 6 months, 2 weeks ago (February 20th, 2010, 4:39 p.m.)
-
/trunk/apps/search/document.py (Diff revision 2) -
Your comment here is redundant. You should instead use intention-revealing variable names such as: should_commit_writer. Boolean variables named my_* are never a good idea.
-
/trunk/apps/search/document.py (Diff revision 2) -
Same as above.
-
/trunk/apps/search/management/commands/fullindex.py (Diff revision 2) -
Nit: trailing space.
Posted 6 months, 2 weeks ago (February 22nd, 2010, 10:15 p.m.)
-
/trunk/apps/search/document.py (Diff revision 2) -
By redundant I mean that the code says what you've said in the comments in fewer words. I.e. "if my_writer: writer.commit()" == "If we own this 'writer', then commit it" You've just rewritten the entire conditional block in a comment :) The second part "Assume someone else will be responsible for committing" is implied by the code. That part would be more useful if you said: "Class X is responsible for committing the writer". The problem with your variable name is that it seems to exist solely for the purpose of indicating that this class should commit the writer but yet the name does not indicate the purpose of the variable. This is why you felt the need to add a comment to clarify it. The name should "fully and accurately describe what the variable represents" (McConnell, Code Complete 2). What does a name like "my_anything" tell us about a boolean? What does it mean for a local boolean variable to be "mine"? Okay, perhaps it tells us that "anything" belongs to this class or scope or something. But why do we need to know that? Because the purpose of the my_anything variable is ambiguous, you'd need a comment to explain its existence. Meaningful variable names replace the need for excessive commenting. However, this code is simple enough that you could have gotten away with that variable name. But the 2 extra lines of comments, especially the first line are not needed.
Review request changed
Updated 6 months, 2 weeks ago (February 24th, 2010, 2:47 p.m.)
-
- added Diff r3
Fixing some tests, updating variable names.
Posted 6 months, 1 week ago (March 2nd, 2010, 11:12 a.m.)
The code looks good. Today I have to update the server's whoosh index on my computer so I will test it and give you the ship it after that.
Flushing...: 1558/5193 [======================== ] ETA: 00:01:23 Flushed run: [(StructFile(None), 149772)] Flushing...: 3012/5193 [=============================================== ] ETA: 00:00:41 Flushed run: [(StructFile(None), 149772), (StructFile(None), 142365)] Flushing...: 4570/5193 [======================================================================== ] ETA: 00:00:11 Flushed run: [(StructFile(None), 149772), (StructFile(None), 142365), (StructFile(None), 144381)] MailMessage: 5193/5193 [==================================================================================] Time: 00:01:40 WikiPage: 4/4 [===========================================================================================] Time: 00:00:01 Flushing...ation: 347/354 [============================================================================= ] ETA: 00:00:01 Flushed run: [(StructFile(None), 149772), (StructFile(None), 142365), (StructFile(None), 144381), (StructFile(None), 135231)] ChatConversation: 354/354 [===============================================================================] Time: 00:00:56 Ticket: 294/294 [=========================================================================================] Time: 00:00:11 Writing index... Flushed run: [(StructFile(None), 149772), (StructFile(None), 142365), (StructFile(None), 144381), (StructFile(None), 135231), (StructFile(None), 14340)] real 4m2.267s user 1m38.222s sys 0m4.516s
-
/trunk/apps/search/management/commands/fullindex.py (Diff revision 3) -
A \n is necessary here.
