
"Search user uploaded photos" feature not using an up-to-date index?


duolian

The <a href="http://www.photo.net/photodb/search">"Search user uploaded photos"</a> feature is a potentially useful one, but I think there is a problem with it. That feature appears not to respond to changes made by users in the information attached to their images.

 

<p>I know this because I have some photos whose captions I have changed, using the "Edit Image Info" feature, to correct misspellings or to add or delete information. When I use "Search user uploaded photos" to search for these photos using terms that were added to the caption after the photo was first uploaded, it does not "hit" them. However, if I search using terms that were part of the caption but were subsequently removed from it, it <i>does</i> "hit" them. I get these results even when the changes to the captions were made many months ago.

 

<p>I assume that, like most search engines, the "Search user uploaded photos" feature is not actually searching in the database <i>per se</i>, but in an index created from the database. With most search engines, that index is periodically rebuilt, so that new contents get included and old contents that have been deleted get removed. What the results described above tell me is that either the index for the "Search user uploaded photos" feature is not being rebuilt properly, or else it is set up so that it is <i>never</i> rebuilt, but only uses the information that was attached to the photo when it was uploaded.
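
To make the suspected failure mode concrete, here is a minimal sketch of a search that runs against an index built once from the captions and never refreshed. It is not photo.net's actual code: the word-level index, the function names, and the sample caption are all assumptions made up for illustration.

```python
from collections import defaultdict

def build_index(captions):
    """Map each lowercase word to the set of photo ids whose caption contains it."""
    index = defaultdict(set)
    for photo_id, caption in captions.items():
        for word in caption.lower().split():
            index[word].add(photo_id)
    return index

def search(index, term):
    """Return the sorted photo ids that the index associates with the term."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical "database": one photo whose caption contains a misspelling.
captions = {1: "Sunset over the habor"}
index = build_index(captions)  # index built once, at upload time

# The user later corrects the caption, but the index is not rebuilt.
captions[1] = "Sunset over the harbor"

stale_old = search(index, "habor")    # removed word still hits the photo
stale_new = search(index, "harbor")   # newly added word finds nothing
fresh_new = search(build_index(captions), "harbor")  # hits after a rebuild

print(stale_old, stale_new, fresh_new)
```

The stale index reproduces both symptoms described above: the removed word still matches and the added word does not, until the index is rebuilt from the current captions.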

 

<p>The "Search user uploaded photos" feature is a particularly important one to have available at photo.net, because there is no substitute for it. I generally don't use the main photo.net search engine because it's clunky and inflexible, and it is much easier and more productive to use Google and set it to return only pages from the www.photo.net domain. However, Google only indexes static content pages at photo.net, so it does not get any of the uploaded photos and the information (caption, etc.) accompanying them. Thus, photo.net's "Search user uploaded photos" feature is the only tool available to search in the photo database.

 

<p>Can the "Search user uploaded photos" feature be changed so that it uses a relatively current index that reflects changes made by users in the information attached to their images?


Brian:

 

<p>There are several photos in my portfolio that the search engine does <i>not</i> find when I search using words that are currently part of the caption, but that it <i>does</i> find using words that used to be in the caption but were taken out of the caption months ago. There are other oddities as well -- I have 38 photos in my portfolio, but when I search using my last name, the search engine tells me that there are <i>32</i> hits -- and it actually only displays <i>30</i> of them.

 

<p>Things are actually worse for some other folks -- like you. You have 27 photos in your photo.net portfolio. However, when I search using "Mottershead", the results page tells me there is exactly 1 hit ("Blue Table And Chairs").

 

<p>Also, when I search using terms that appear in the captions of some of your photos -- "Lully" ("Rising Storm, Lully"), "checkerboard" ("Sidewalk Checkerboard"), and "artifact" ("Artifact of the Old Economy"), for example -- the search engine does not find the photos. In other words, those captions seem not to have made it into the index.

 

<p>Or how about this: try to find the current POW, which is captioned "Rajasthani women", by searching for "Rajasthani". The "Search user uploaded photos" search engine doesn't find it.

 

<p>Something is clearly not working correctly with that search engine.


  • 2 weeks later...

I discovered that the indexing of photos was disabled after around 100,000 photos had been indexed. Since we now have around 350,000, that was quite a while ago. I am concerned that indexing all 350,000 would be (a) quite expensive in resources; and (b) not very useful, since a lot of these photos are neither good nor of any great interest.

 

I am thinking of indexing only the following photos: (1) anything posted within the last 60 days; (2) any photo with at least 5 ratings averaging 12 (O+A total) or higher.
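
As a rough sketch of how the two proposed criteria might be expressed, the snippet below treats each rating as a single combined O+A total. That representation, the `Photo` class, and the `should_index` function are assumptions for illustration, not the site's actual schema or code.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Photo:
    posted: date
    # Hypothetical representation: one combined O+A total per rating received.
    ratings: list = field(default_factory=list)

def should_index(photo, today, max_age_days=60, min_ratings=5, min_average=12):
    """Apply the two proposed rules: recent, or rated often and highly enough."""
    recent = (today - photo.posted) <= timedelta(days=max_age_days)
    count = len(photo.ratings)
    well_rated = count >= min_ratings and sum(photo.ratings) / count >= min_average
    return recent or well_rated

today = date(2003, 1, 1)  # arbitrary reference date for the example
new_unrated = Photo(posted=date(2002, 12, 1))
old_well_rated = Photo(posted=date(2002, 1, 1), ratings=[12, 13, 12, 14, 12])
old_few_ratings = Photo(posted=date(2002, 1, 1), ratings=[14, 14, 14, 14])

print(should_index(new_unrated, today))      # recent, so indexed
print(should_index(old_well_rated, today))   # 5 ratings averaging above 12
print(should_index(old_few_ratings, today))  # only 4 ratings, so skipped
```

Keeping the thresholds as parameters makes it easy to try other cut-offs without touching the rule itself.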


  • 2 weeks later...

Brian, could you include "more than 3 comments" as a criterion? This would weed out most of the junk while keeping in photos that might get a low score because they're controversial.

 

Will you use the folder name in the indexing process, and have you considered asking photographers to categorise their work or provide keywords? The titles often don't give much idea of the content, and if the photo is obviously of a particular subject there may not seem to be any need to mention this in the caption. Asking for keywords at this stage would also be a good way of checking who is remaining active in the community.

 

Finally the "score" (first column) seems an obscure parameter and a fairly useless one.

 

I must say I'm absolutely delighted that you are re-jigging the search process because its present weakness is the main reason I have been spending more time on PhotoSIG recently.

 

What I would really like is a special category comprised of: family snaps, flowers, cars, pets and 9/11 - so I could EXCLUDE it.

 

Thanks for an otherwise great site.


  • 4 months later...

Brian, indexing 400,000 images may be resource intensive on the first run, but after that the index only needs to be updated for new or changed images. The indexing engine will not need to rebuild the entire index every time. And instead of using a kind of fascist policy to exclude lower-rated images, is it such a big deal to index all images and add an option to sort by rating?
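
To illustrate what updating only the new or changed images could look like, here is a small in-memory sketch. It is not how Oracle's indexing actually works internally; the `CaptionIndex` class and its methods are hypothetical, and the edited caption in the usage lines is invented.

```python
from collections import defaultdict

class CaptionIndex:
    """Toy word index that is patched per photo instead of rebuilt from scratch."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of photos containing it
        self.indexed_words = {}           # photo id -> words currently indexed

    def upsert(self, photo_id, caption):
        """Index a new photo, or apply only the word changes of an edited caption."""
        new_words = set(caption.lower().split())
        old_words = self.indexed_words.get(photo_id, set())
        for word in old_words - new_words:   # words removed by the edit
            self.postings[word].discard(photo_id)
        for word in new_words - old_words:   # words added by the edit
            self.postings[word].add(photo_id)
        self.indexed_words[photo_id] = new_words

    def delete(self, photo_id):
        """Remove every posting for a deleted photo."""
        for word in self.indexed_words.pop(photo_id, set()):
            self.postings[word].discard(photo_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

idx = CaptionIndex()
idx.upsert(1, "Rajasthani women")
idx.upsert(2, "Sidewalk Checkerboard")
idx.upsert(1, "Women of Rajasthan")   # hypothetical caption edit for photo 1

print(idx.search("rajasthani"))    # old word no longer matches
print(idx.search("rajasthan"))     # new word matches photo 1
print(idx.search("checkerboard"))  # photo 2 was never touched by the edit
```

The work per edit is proportional to the number of words that changed in one caption, not to the size of the whole collection.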

 

I really do appreciate what you have done for this site, but I think you should make it a higher priority to fix the search button.

 

Thank you.


I think it's not just the Oracle performance required to (re)build the index that's an issue here.

 

The index must be stored somewhere (i.e., requires disk space), needs to be updated for every insert/edit/delete of a photo (i.e., requires Oracle power several thousand times each day) and will be used (duhh!) by the members (i.e., requires even more Oracle power each day).

 

One could argue that a working search index would increase the use of the gallery, resulting in more subscriptions, but I'm not too sure about that.


>The index must be stored somewhere (i.e., requires disk space)

 

A good point, but even if you allow a generous 1 KB of text to be indexed per image, that adds only a small fraction to the disk space already used by that image. That would be 400 MB for 400,000 images, but in practice the index will be smaller than that. I guess you can get a rough estimate by looking at how much space is taken by the current outdated index for 100,000 photos.
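
The arithmetic behind that estimate, as a quick sketch. Decimal units (1 KB = 1,000 bytes) and linear scaling of index size with photo count are assumptions, and the function name is made up; the size of the existing index would have to be measured on the server.

```python
IMAGES = 400_000
TEXT_BYTES_PER_IMAGE = 1_000  # the "generous 1 KB" allowance, decimal units

# Upper bound if every image really carried a full 1 KB of indexed text.
upper_bound_mb = IMAGES * TEXT_BYTES_PER_IMAGE / 1_000_000
print(upper_bound_mb)  # 400.0

def extrapolate_index_mb(current_index_mb, indexed_photos=100_000, target_photos=IMAGES):
    """Scale a measured index size linearly with the number of photos (an assumption)."""
    return current_index_mb * target_photos / indexed_photos
```

Passing the measured size of the existing 100,000-photo index to `extrapolate_index_mb` would give a ballpark for the full collection.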

 

 

> needs to be updated for every insert/edit/delete of a photo (i.e., requires Oracle power several thousand times each day)

 

Several thousand per day translates to a few times per minute: no sweat for the Oracle engine.
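
For what it's worth, the conversion can be checked in a couple of lines. The daily figures below are just sample values for "several thousand", and the result is an average that ignores peak hours.

```python
MINUTES_PER_DAY = 24 * 60  # 1440

def updates_per_minute(updates_per_day):
    """Average update rate per minute for a given daily total."""
    return updates_per_day / MINUTES_PER_DAY

for per_day in (2_000, 3_000, 5_000):
    print(per_day, round(updates_per_minute(per_day), 2))
```

That works out to roughly 1.4 to 3.5 index updates per minute on average.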

 

> and will be used (duhh!) by the members (i.e., requires even more Oracle power each day).

 

"featuring 430,853 images" is in the photo.net headline. I suggest adding "(but you can't search them)"</sarcasm>.

 

> One could argue that a working search index would increase the use of the gallery, resulting in more subscriptions, but I'm not too sure about that.

 

I agree. It's kinda disappointing when there is a great site like photo.net that doesn't have a good search engine.

