[PPP] The long-awaited search function

Law Bringer
Member # 2984
Profile Homepage #0
It's dirty, it's buggy, but it's here.

http://pied-piper.ermarian.net/search/search.php

Several things to note:

1. You need to have Javascript enabled to use the form. I'm working on fixing this (i.e., automatically switching to a non-JS form if JS is disabled). Also, depending on what browser you are using, you may get strange errors. Bear in mind this is my Javascript, and I suck at it even more than the language already does by itself. (I've tested it with Firefox 1.5, Opera 8 and MSIE 6.)

2. I have barely tested the thing. Don't be surprised if it explodes in your face or gives strange error messages. All the same, I'd appreciate being notified of that, so I can fix it.

3. Full-text and participant searching (i.e., any search that queries individual posts rather than topics) is disabled, because the posts have never been indexed in a database. This may change some time in the future.

[ Tuesday, February 21, 2006 12:44: Message edited by: Arancaytar the Grey ]

--------------------
Encyclopaedia Ermariana | Forum Archives | Forum Statistics | RSS [Topic / Forum]
My Blog | Polaris
I eat novels for breakfast.
Polaris is dead, long live Polaris.
Look on my works, ye mighty, and despair.
Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Shock Trooper
Member # 6666
Profile #1
Neat.

As for #2 on your list, I'll contribute:
The box where you type the keywords doesn't appear for me until I change the "search by" criterion once.

Using Firefox 1.5.0.1 on XP SP2 in case you care.
Posts: 353 | Registered: Monday, January 9 2006 08:00
Law Bringer
Member # 2984
Profile Homepage #2
Yes, that should now be fixed. The query fields are all initially invisible, so now on startup I make the one visible that is currently being searched - or the title field by default. :)
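
A minimal sketch of that startup rule, written as a plain function so it can be run outside a browser. The layer names are borrowed from the snippet Zeviz posts further down in this thread; the mapping from search criterion to layer and the name initialVisibility are assumptions for illustration, not the actual code behind the form.

```javascript
// Assumed mapping from "search by" criterion to the layer holding its input.
// Layer names come from the snippet later in the thread.
const LAYER_FOR_CRITERION = new Map([
  ['title', 'titlelayer'],
  ['starter', 'memberlayer'],
  ['participant', 'memberlayer'],
  ['date', 'datelayer'],
  ['length', 'lengthlayer'],
]);

// Returns the visibility each layer should have on startup: the layer for
// the criterion currently being searched is visible, all others hidden.
// Unknown or missing criteria fall back to the title field.
function initialVisibility(currentCriterion) {
  const visibleLayer = LAYER_FOR_CRITERION.get(currentCriterion) || 'titlelayer';
  const visibility = {};
  for (const layer of new Set(LAYER_FOR_CRITERION.values())) {
    visibility[layer] = layer === visibleLayer ? 'visible' : 'hidden';
  }
  return visibility;
}

console.log(initialVisibility('length'));
console.log(initialVisibility(undefined));
```

In a page, each entry of the returned object would be applied to the matching element's style.visibility once the form has loaded.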

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Law Bringer
Member # 6489
Profile Homepage #3
When searching by length, no matter which board I select, it only gives results from general.

[ Tuesday, February 21, 2006 15:07: Message edited by: Tyranicus ]

--------------------
"You're drinking liquor because you're thirsty? How nasty is your freaking water?" —Lazarus
Spiderweb Chat Room
Avernum RPSummariesOoCRoster
Shadow Vale - My site, home of the Spiderweb Chat Database, BoA Scenario Database, & the A1 Quest List, among other things.
Posts: 1556 | Registered: Sunday, November 20 2005 08:00
Law Bringer
Member # 2984
Profile Homepage #4
Ack. I'm tired. :rolleyes: That should be fixed as well now...

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Nuke and Pave
Member # 24
Profile Homepage #5
Nice idea. Are you planning to add full-text search?

As for bugs:
1. In profiles, birthdays are 1 day off and dates registered are completely random. They are fine in January's profiles.

2. Searching by topic starter gives the following error when searching by user name instead of member number: [message removed]

[ Tuesday, February 21, 2006 15:41: Message edited by: Zeviz ]

--------------------
Be careful with a word, as you would with a sword,
For it too has the power to kill.
However well placed word, unlike a well placed sword,
Can also have the power to heal.
Posts: 2649 | Registered: Wednesday, October 3 2001 07:00
Law Bringer
Member # 2984
Profile Homepage #6
1. No full-text search, until I find a way to get the posts into the database - and I find a database big enough. That can mean anything between two years and never.

2. Birthdays are off due to timezone trouble. I'll fix this soon. Registration dates were completely messed up this update; I'll correct them when I do the next update (they won't have changed after all).

3. You can't search by member name, and you're not supposed to. That's why the field says "member number". (Oh, and could you take out that error message please, especially the file path? I don't mind people knowing my surname, but I appreciate it when it's not all over Google. :) )
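
On the birthday offset in point 2: the thread only says "timezone trouble", so the exact cause is not known. One common way a date ends up a day off is sketched below, assuming the birthday is stored as a timestamp at midnight UTC and then read back in a zone west of UTC. The function name and the sample date are made up for illustration.

```javascript
// Calendar day (YYYY-MM-DD) of a timestamp as seen at a fixed UTC offset.
function calendarDay(timestampMs, utcOffsetHours) {
  const shifted = new Date(timestampMs + utcOffsetHours * 60 * 60 * 1000);
  // toISOString() always reports UTC, so the shifted value yields the local day.
  return shifted.toISOString().slice(0, 10);
}

// Hypothetical birthday stored as midnight UTC (month index 4 is May).
const storedBirthday = Date.UTC(1985, 4, 14);

console.log(calendarDay(storedBirthday, 0));  // 1985-05-14
console.log(calendarDay(storedBirthday, -8)); // 1985-05-13
```

Storing the birthday as a plain year-month-day value, rather than as an instant in time, would sidestep the shift altogether.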

[ Tuesday, February 21, 2006 15:37: Message edited by: Arancaytar the Grey ]

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Nuke and Pave
Member # 24
Profile Homepage #7
Sorry, I didn't realize that path included your name. (If you are sensitive about it, you could use a random username to avoid these problems.)

About errors, for some reason your code that's supposed to change the searchBy field doesn't work in IE 6.0.2800.1106.xpsp2... The search-by field displayed doesn't change until I click the Search button. I am not an expert on Javascript, but have you tried using an onSelect event instead of onClick?

[ Tuesday, February 21, 2006 16:01: Message edited by: Zeviz ]

Posts: 2649 | Registered: Wednesday, October 3 2001 07:00
Master
Member # 4614
Profile Homepage #8
That's really cool. I tried searching for all the topics that got more than 300 posts and noticed some interesting things.
You have the What are you Listerning To? thread that got 1002 posts.
You seem to have regained the Valley of Thunder.
Thuryl's 5000 topic had 396 posts? :eek:
Xian Skull is still lost to the void. :(

--------------------
-ben4808
Posts: 3360 | Registered: Friday, June 25 2004 07:00
Law Bringer
Member # 2984
Profile Homepage #9
The second one (VoT) was actually the most surprising to me. It was saved in PPP2.

However, there is no guarantee that it was the last version to be posted in. The last post in the topic mentions PPP2, so the whole topic might well have continued past that point, only that didn't get saved because PPP2 was done.

Also, I didn't save the music topic. I don't know who did (Kelandon?), but it was saved in PPP1.

Zeviz: No, apparently I used a separate onClick() event for each option. Could that be the problem?
I can change it to onSelect events, I guess...

Also, the host required me to provide a real name. I had no idea they were also going to use this name for the directory I store my files in. :P

[ Tuesday, February 21, 2006 20:49: Message edited by: Arancaytar the Grey ]

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Nuke and Pave
Member # 24
Profile Homepage #10
Aran, the following modifications make the search criteria selection work fine on my computer. I've replaced all "onClick" events with a single "onChange" for the whole option block:

<select name='criterium' size='1' onChange='processChange(this.selectedIndex)'>
  <option value='title' selected="selected">Title</option>
  <option value='starter'>Starter</option>
  <option value='participant'>Participant</option>
  <option value='date'>Topic Date</option>
  <option value='length'>Length</option>
  <option value='text'>Full Text</option>
</select>

// Show only the input layer that matches the selected "search by" option.
function processChange(index) {
  // Hide every layer first.
  document.all.lengthlayer.style.visibility = 'hidden';
  document.all.memberlayer.style.visibility = 'hidden';
  document.all.datelayer.style.visibility = 'hidden';
  document.all.titlelayer.style.visibility = 'hidden';

  // Then reveal one layer; index follows the order of the <option> tags.
  switch (index) {
    case 0: // Title
      document.all.titlelayer.style.visibility = 'visible';
      break;
    case 1: // Starter
    case 2: // Participant (shares the member field with Starter)
      document.all.memberlayer.style.visibility = 'visible';
      break;
    case 3: // Topic Date
      document.all.datelayer.style.visibility = 'visible';
      break;
    case 4: // Length
      document.all.lengthlayer.style.visibility = 'visible';
      break;
    // Index 5 (Full Text) has no case, so every layer stays hidden.
  }
}


Posts: 2649 | Registered: Wednesday, October 3 2001 07:00
Law Bringer
Member # 2984
Profile Homepage #11
Thanks! :) As I said, I royally suck at Javascript.

I'll try to put this in and see if it works.

--

On another note, I've started - as a test run - to index a part of the posts in the General archive. It's as bad as I feared - the table will, in total, be about 200-300 MB large, and my database is limited to 200 MB in size. For now, it is in fact possible to search by full text or participant, with the following restrictions:

1. Only topics in General
2. Only topics archived in PPP3 - older archives don't work.

Edit: I also seem to be missing a few posts, randomly. My index says the archived copy of Runescape ought to have 405 posts, but when entering the posts into the database, I can only find 397 of them.

Ah well. It works. In a very rough, incomplete sense.

[ Wednesday, February 22, 2006 20:59: Message edited by: Arancaytar the Grey ]

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Law Bringer
Member # 2984
Profile Homepage #12
I've put in your Javascript change, and I hope it works for you now. :)

Also, the main site is now extended a bit, including a page for browsing all threads in the forum, and the profile page from the endeavor.

Edit: My apologies. I spelt "archives" with an s when I made the rudimentary cms system, and I spelt it "archive" in the link. :rolleyes:

[ Thursday, February 23, 2006 21:54: Message edited by: Arancaytar the Grey ]

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Master
Member # 4614
Profile Homepage #13
Er, the last two links don't work, but otherwise the page looks pretty cool.

Posts: 3360 | Registered: Friday, June 25 2004 07:00
Nuke and Pave
Member # 24
Profile Homepage #14
Very nice. :) (And thanks for including the fix I've mentioned.)

Looks like "fluffy turtles" have been mentioned in 26 threads. :rolleyes:

300 MB for database size? That's a lot of spa... useful information. If you want to have a fun programming challenge, you could try to fit that into your database by compressing the post text. If you also compress the search phrases, you can still use SQL's comparisons to do the searches. You'd then have to retrieve links to plain text versions of the threads, stored the same way you store them now. (Uncompressing all posts on the fly might be too CPU intensive, but I have no idea, since I've never tried anything like this.)

PS I am not an expert on web programming either, but there is a great reference that I use which usually has necessary information. (complete HTML and CSS reference, list of JavaScript events, etc.) You might already know about it, but here is the address anyway: http://www.blooberry.com/indexdot/html/

Posts: 2649 | Registered: Wednesday, October 3 2001 07:00
Law Bringer
Member # 2984
Profile Homepage #15
Well, so far I'm not sure which way I want to go with the archive structure. Basically, I can keep it as it is now - "hybrid" in a way, or half database-driven (the topic list) and half flat-file based (the saved pages themselves). Or, as I once intended, I can go all the way and enter the posts into the database as well, and then generate the archived pages from the database instead of using the flat files.

The second has the advantage of adaptability and searchability, but it also uses more database space - which is usually more limited and more expensive than flat file space when it comes to hosting.

However, some things are redundant - take the ever-repeating signatures for example. I entered about 40,000 General posts in the database (the ones that got dumped last month), and that took 50 MB.

However, 20% of this space is taken up by signatures (which I stored in a separate field for each post).

Applying the rules of database normalization, I should make a new table that stores all *different* signatures used by each member, and then have the original table refer to this table. This table, as I have found out, has roughly 800 rows (or 2% of all posts), and uses a negligible amount of space - less than 100 KB.

In other words, of the estimated 250 MB, about 50 MB could be saved merely by normalizing the signature field. :)
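
A rough sketch of that normalization step, using in-memory arrays in place of real tables. The field names (member, signature, signatureId), the function name and the sample rows are assumptions for illustration; only the idea of one row per distinct signature per member, plus the 250 MB and 20% figures, comes from the post above.

```javascript
// Splits a post list into a signature table (one row per distinct
// member/signature pair) and posts that refer to it by id.
function normalizeSignatures(posts) {
  const idByKey = new Map(); // "member\u0000text" -> signature id
  const signatures = [];     // rows of the new signature table
  const normalizedPosts = posts.map((post) => {
    const key = post.member + '\u0000' + post.signature;
    let id = idByKey.get(key);
    if (id === undefined) {
      id = signatures.length + 1;
      idByKey.set(key, id);
      signatures.push({ id, member: post.member, text: post.signature });
    }
    // Drop the repeated text and keep only the reference.
    const { signature, ...rest } = post;
    return { ...rest, signatureId: id };
  });
  return { signatures, posts: normalizedPosts };
}

// Hypothetical sample rows.
const samplePosts = [
  { postId: 1, member: 2984, text: 'first', signature: 'Polaris is dead, long live Polaris.' },
  { postId: 2, member: 2984, text: 'second', signature: 'Polaris is dead, long live Polaris.' },
  { postId: 3, member: 4614, text: 'third', signature: '-ben4808' },
];
const normalized = normalizeSignatures(samplePosts);
console.log(normalized.signatures.length); // 2

// Arithmetic from the post: 20% of an estimated 250 MB is signature text.
const estimatedTotalMb = 250;
const signaturePercent = 20;
console.log((estimatedTotalMb * signaturePercent) / 100); // 50
```

The saving is slightly smaller in practice, since every post row still carries a small reference to its signature.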

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00
Nuke and Pave
Member # 24
Profile Homepage #16
I tried to continue via PM to avoid too much computer jargon in this thread, but your inbox is full.

Here are a few suggestions I have for your search feature:

1. I really like your hybrid structure, because it is equivalent to caching every thread on the web server. Instead of having to query a bunch of tables and reconstruct the thread every time you want to present it, it allows you to just serve a file off the web server. So even after you put everything into the database, it will still be more efficient to keep links in the database to text files on the web server for presentation.

2. Another normalization you could do (if you haven't done so already) is storing mini-profiles that contain location, title and (possibly) displayed name combinations. (e.g. 1, 24, Berkeley, Blademaster, Zeviz; 2, 24, California, Nuke and Pave, Zeviz; etc.) That wouldn't save you as much as separating signatures, but it still adds up when counted over a span of hundreds of posts.

3. For storing posts, you could separate all meta-data (poster, date, thread, etc) into a separate table from actual content. This will let you enable participant, date, etc searches for all posts, even if storing their contents would take too much space.

4. Could you add a COUNT search option for weird people who are curious whether the phrase "sanity jar" is used more often than "fluffy turtles"? (Yes, I am extremely bored today.) :)
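
Suggestions 3 and 4 can be sketched together. The snippet below assumes two hypothetical tables, one with metadata for every post and one with content for only some of them, and shows that a participant search needs only the first, while a phrase count runs over whatever content is stored. All names and rows are invented for illustration.

```javascript
// Hypothetical metadata table: one small row per post, always present.
const postMeta = [
  { postId: 1, threadId: 10, member: 24, date: '2006-02-21' },
  { postId: 2, threadId: 10, member: 2984, date: '2006-02-21' },
  { postId: 3, threadId: 11, member: 24, date: '2006-02-22' },
];

// Hypothetical content table: only filled where space allows.
const postContent = new Map([
  [1, 'fluffy turtles and a sanity jar'],
  [2, 'more Fluffy Turtles'],
  // Post 3 has no stored content but is still findable by participant.
]);

// Participant search: touches metadata only.
function threadsWithParticipant(member) {
  const threads = new Set();
  for (const row of postMeta) {
    if (row.member === member) threads.add(row.threadId);
  }
  return [...threads].sort((a, b) => a - b);
}

// COUNT-style search: number of stored posts containing the phrase.
function countPostsContaining(phrase) {
  const needle = phrase.toLowerCase();
  let count = 0;
  for (const text of postContent.values()) {
    if (text.toLowerCase().includes(needle)) count += 1;
  }
  return count;
}

console.log(threadsWithParticipant(24));             // [ 10, 11 ]
console.log(countPostsContaining('fluffy turtles')); // 2
console.log(countPostsContaining('sanity jar'));     // 1
```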

Thanks for making this archive.

Posts: 2649 | Registered: Wednesday, October 3 2001 07:00
Law Bringer
Member # 2984
Profile Homepage #17
Sorry, I keep my inbox habitually full. I do have an email address though...

2. The member profiles are already stored separately - that's where I get the monthly statistics from! :)

3. That's a great idea!

4. I'm planning to add a lot of things in the search function right now; one of them is ordering the hits and another is browsing through several pages' worth of hits (so far, it can only display the first 300). Displaying the number of results wouldn't be difficult either.
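
A small sketch of what ordering plus paging could look like, assuming the hits are already in memory as an array. The page size of 300 matches the limit mentioned above; the function name, the hit fields and the comparator are assumptions, and a real version would more likely push the ordering and limits into the database query.

```javascript
// Orders the hits, then returns one page of them plus the totals needed
// to display "page X of Y" and the overall number of results.
function pageOfHits(hits, pageNumber, compare, pageSize = 300) {
  const ordered = [...hits].sort(compare); // copy so the input stays untouched
  const totalPages = Math.max(1, Math.ceil(ordered.length / pageSize));
  const page = Math.min(Math.max(1, pageNumber), totalPages); // clamp to range
  const start = (page - 1) * pageSize;
  return {
    total: ordered.length,
    page,
    totalPages,
    hits: ordered.slice(start, start + pageSize),
  };
}

// Hypothetical hits: 650 topics, ordered by post count, longest first.
const exampleHits = Array.from({ length: 650 }, (_, i) => ({ topicId: i, posts: i }));
const longestFirst = (a, b) => b.posts - a.posts;

const thirdPage = pageOfHits(exampleHits, 3, longestFirst);
console.log(thirdPage.total);       // 650
console.log(thirdPage.totalPages);  // 3
console.log(thirdPage.hits.length); // 50
```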

Incidentally, the longest post stored so far is an article on abortion quoted by Ash Lael back in April last year. 44,000 characters...

Posts: 8752 | Registered: Wednesday, May 14 2003 07:00