Linguistic and numerical indexing in accessing pictorial databases

 

By

 

Mark A. Holmes

 

CSE 580.01 Winter 2016

Here I discuss various ways of efficiently retrieving objects from pictorial databases.

 

Objects can be retrieved efficiently from pictorial databases using a multilevel index and dense indexing, although secondary indexing and sparse indexing could be used for fuzzy logic searches. The number of disk accesses and the storage space taken up by the index would also be considerations, of course.

 

Linguistic or numerical indexing alone

Maybe linguistic indexing alone is not that efficient

 

Linguistic indexing might include names of people in the picture, what or who is depicted in the picture, the type of event, the location, the time, or the general mood of the picture. You would want to have an abstract, or concise description of the contents of the picture, if this is how you want to organize your images.
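
As a rough illustration of linguistic indexing, the sketch below builds a keyword index from abstracts, so that a search term leads to the pictures whose abstracts mention it. The abstracts, the image IDs, and the function names are all made up for this example; it is one simple way to do it, not a description of any particular system.

```python
import re
from collections import defaultdict

# Hypothetical abstracts keyed by image ID; the data is invented for illustration.
abstracts = {
    1: "A smiling baby at a birthday party",
    2: "Denali at sunrise, seen from Wonder Lake",
    3: "A dog running on the beach near Denali Street",
}

def build_keyword_index(abstracts):
    """Map each lowercase word to the set of image IDs whose abstract contains it."""
    index = defaultdict(set)
    for image_id, text in abstracts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(image_id)
    return index

def search(index, query):
    """Return the IDs of images whose abstracts contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

keyword_index = build_keyword_index(abstracts)
print(sorted(search(keyword_index, "Denali")))        # [2, 3]
print(sorted(search(keyword_index, "smiling baby")))  # [1]
```

Note that the search is only as good as the abstracts: a picture with no abstract, or with an abstract that leaves something out, will never be found this way.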

 

It isn’t always possible to obtain this information, and there might also be linguistic barriers: the information may not be in a language the searcher understands. For example, the abstract may be in English, but the searcher may not be an English speaker.

 

Typographical errors may also impede linguistic indexing.
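
One possible way to soften the effect of typographical errors is to match a misspelled search term against the known index terms by similarity. The sketch below uses Python's standard difflib module for this; the term list, the function name, and the 0.7 cutoff are assumptions chosen for the example.

```python
import difflib

# Hypothetical index terms taken from picture abstracts.
index_terms = ["denali", "dog", "baby", "wedding", "sunrise"]

def correct_term(term, vocabulary, cutoff=0.7):
    """Return the closest known index term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(correct_term("Denlai", index_terms))     # denali
print(correct_term("xylophone", index_terms))  # None
```

This helps when the searcher mistypes a term. A typo inside the abstract itself would need the same treatment when the index is built.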

Maybe numerical indexing alone is not that efficient, either

Difficulty in remembering numbers

It would not be realistic to expect people to remember index numbers to use as search terms. People want to search by image attribute (“dog”, “Denali”, “smiling baby”, etc.), not by some number.

Limitations in memory would likely require reference-based indexing.

Update anomalies

There would be times when one would have to delete or modify an image stored in the database. If the index numbers of the remaining images are not to be renumbered when an image is deleted, the deleted entry has to be filled with null values instead. Abstracts would also have to be deleted when the image they describe is deleted, or edited when it is modified, to avoid update anomalies in the form of inaccurate text or “orphaned” text that refers to nothing.
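
To make the orphaned-abstract problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical. It assumes abstracts live in their own table and are tied to images by a foreign key with ON DELETE CASCADE, so that deleting an image removes its abstract too, while the remaining images keep their index numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE abstracts (
           image_id INTEGER NOT NULL REFERENCES images(id) ON DELETE CASCADE,
           body TEXT NOT NULL
       )"""
)
conn.execute("INSERT INTO images (id, path) VALUES (1, 'denali.jpg'), (2, 'baby.jpg')")
conn.execute(
    "INSERT INTO abstracts (image_id, body) "
    "VALUES (1, 'Denali at sunrise'), (2, 'A smiling baby')"
)

# Deleting the image also deletes its abstract, so no orphaned text is left behind.
conn.execute("DELETE FROM images WHERE id = 1")

remaining_abstracts = conn.execute("SELECT image_id, body FROM abstracts").fetchall()
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM images ORDER BY id")]
print(remaining_abstracts)  # [(2, 'A smiling baby')]
print(remaining_ids)        # [2]
```

Image 2 keeps its number after image 1 is gone, so nothing that refers to it has to be updated.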

Multiple languages

As noted earlier, not everybody in the world is a native English speaker. Database systems such as MongoDB can store translations of abstracts alongside the original, where abstracts exist.
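
The sketch below shows one way a record could carry abstracts in more than one language. It uses a plain Python dictionary shaped like a document in a document database; it does not call any real database API, and the field names, language codes, and function name are assumptions made for illustration. The translations themselves would have to come from somewhere else.

```python
# A hypothetical document shape: one abstract per language code.
image_doc = {
    "image_id": 2,
    "abstracts": {
        "en": "Denali at sunrise",
        "es": "Denali al amanecer",
    },
}

def abstract_for(doc, preferred_languages, fallback="en"):
    """Return (language, abstract) for the first preferred language available."""
    abstracts = doc.get("abstracts", {})
    for lang in preferred_languages:
        if lang in abstracts:
            return lang, abstracts[lang]
    if fallback in abstracts:
        return fallback, abstracts[fallback]
    return None, None

print(abstract_for(image_doc, ["fr", "es"]))  # ('es', 'Denali al amanecer')
print(abstract_for(image_doc, ["de"]))        # ('en', 'Denali at sunrise')
```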

Efficiency of numerical indexing

You would probably want to use a table that maps each index number to the location of its image.

How this might work

Universal translators

As noted earlier, database systems such as MongoDB can handle translations of abstracts, at least potentially.

Potential problems

Image content not described by abstracts

Just because an abstract exists doesn’t mean it completely and accurately describes the contents of the image. Poorly written abstracts, therefore, can be an impediment to accurate image searches.

Image content described by abstracts, but inadequately

In fact, what is used right now is shape-based indexing, which relies on neither abstracts nor incremented integer index numbers and often uses hashes.

Glossary

 

Dense indexing:

Dense indexing involves the use of a dense index, a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to a record in the sorted data file. In clustered indices with duplicate keys, the dense index points to the first record with that key.

Sparse indexing:

Sparse indexing involves the use of a sparse index, a file with pairs of keys and pointers for every block in the data file. Every key in this file is associated with a particular pointer to the block in the sorted data file. In clustered indices with duplicate keys, the sparse index points to the lowest search key in each block.
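
The difference between the two can be sketched in a few lines of Python. The records, the block size of three, and the function names below are invented for the example, and an in-memory dictionary and list stand in for the index files. The dense index holds one entry per record, while the sparse index holds only the lowest key of each block and scans the block it lands in.

```python
import bisect

# Hypothetical sorted data file, split into blocks of three records each.
records = [(3, "a.jpg"), (7, "b.jpg"), (9, "c.jpg"),
           (12, "d.jpg"), (15, "e.jpg"), (20, "f.jpg"),
           (24, "g.jpg")]
BLOCK_SIZE = 3
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Dense index: one (key -> block number, slot) entry for every record.
dense_index = {key: (b, s)
               for b, block in enumerate(blocks)
               for s, (key, _) in enumerate(block)}

# Sparse index: one entry per block, holding the lowest key in that block.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    """Follow the pointer straight to the record; no scanning needed."""
    location = dense_index.get(key)
    if location is None:
        return None
    b, s = location
    return blocks[b][s][1]

def sparse_lookup(key):
    """Find the last block whose lowest key is <= key, then scan that block."""
    b = bisect.bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for record_key, value in blocks[b]:
        if record_key == key:
            return value
    return None

print(len(dense_index), len(sparse_keys))    # 7 3
print(dense_lookup(15), sparse_lookup(15))   # e.jpg e.jpg
```

The sparse index is smaller, which saves storage space, but each lookup pays for a scan inside one block.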

 

Unique Indexing:

A unique index does not allow any duplicate values to be inserted into the indexed column or columns of the table. The basic syntax is as follows:

CREATE UNIQUE INDEX index_name ON table_name (column_name);

Composite Indexes:

A composite index is an index on two or more columns of a table. The basic syntax is as follows:

CREATE INDEX index_name ON table_name (column1, column2);

Implicit Indexes:

Implicit indexes are automatically created by the database server when an object is created. Indexes are automatically created for primary key constraints and unique constraints.
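
The three kinds of index above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table, the column names, and the index names are hypothetical, and the behaviour shown is SQLite's, which may differ in detail from other database servers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The TEXT primary key makes SQLite create an implicit index on file_name.
conn.execute(
    "CREATE TABLE photos (file_name TEXT PRIMARY KEY, checksum TEXT, "
    "subject TEXT, location TEXT)"
)
conn.execute("CREATE UNIQUE INDEX idx_photos_checksum ON photos (checksum)")
conn.execute("CREATE INDEX idx_photos_subject_location ON photos (subject, location)")

conn.execute("INSERT INTO photos VALUES ('denali.jpg', 'abc123', 'mountain', 'Alaska')")
try:
    # Same checksum under a different file name: the unique index rejects it.
    conn.execute("INSERT INTO photos VALUES ('copy.jpg', 'abc123', 'mountain', 'Alaska')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Each row of PRAGMA index_list is (seq, name, unique, origin, partial).
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(photos)"))
print(duplicate_rejected)
print(index_names)
```

The list of index names includes the two created explicitly plus one that SQLite created on its own for the primary key.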

Hash table indexing:

In hash table indexing, the column value is the key to the hash table and the value mapped to that key is a pointer to the row data in the table. The value you would look up, say, “Denali”, would be the left side of the hash table entry, and the right side would be an alphanumeric sequence that refers to the table row where Denali, based on your photo’s abstract, is stored. It would look something like “Denali => 0x27799”. These keys are not stored in any particular order, so the index can only be used for queries that check for equality (e.g., WHERE subject = 'Denali').
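
A Python dictionary is itself a hash table, so the idea can be sketched directly. The rows and function names below are made up, and a list position stands in for the row pointer. The equality lookup goes straight to the row, while a prefix search gets no help from the hash and has to check every key.

```python
# Hypothetical table rows; the list position stands in for a row pointer.
rows = [
    {"subject": "Denali", "file": "denali.jpg"},
    {"subject": "dog", "file": "dog.jpg"},
    {"subject": "Denver", "file": "denver.jpg"},
]

# Hash index on the "subject" column: column value -> row position.
hash_index = {row["subject"]: position for position, row in enumerate(rows)}

def find_equal(subject):
    """Equality lookup: one hash probe, then follow the pointer to the row."""
    position = hash_index.get(subject)
    return rows[position] if position is not None else None

def find_prefix(prefix):
    """A prefix query gets no help from the hash: every key must be checked."""
    return [rows[pos] for key, pos in hash_index.items() if key.startswith(prefix)]

print(find_equal("Denali"))                                 # the Denali row
print(sorted(row["file"] for row in find_prefix("Den")))    # ['denali.jpg', 'denver.jpg']
```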

 

Binary indexed tree:

 

Also known as a Fenwick tree, after New Zealand computer scientist Peter Fenwick, who described it in 1994. Despite the similar name, it is not the same structure as a B-tree. A binary indexed tree provides a method for calculating and manipulating the prefix sums of a table of values (for example, a database index table). It calculates a prefix sum and modifies a table entry in O(log n) time each, where n is the size of the table.
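
A minimal sketch of the structure follows. The class and method names are my own, positions are 1-based, and the sample values are invented. Both operations walk at most about log n entries of the internal array by adding or removing the lowest set bit of the position.

```python
class FenwickTree:
    """Binary indexed tree over positions 1..size, all values starting at zero."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)  # slot 0 is unused

    def add(self, index, delta):
        """Add delta to the value at 1-based position index."""
        while index <= self.size:
            self.tree[index] += delta
            index += index & -index  # move to the next node covering this position

    def prefix_sum(self, index):
        """Return the sum of the values at positions 1..index."""
        total = 0
        while index > 0:
            total += self.tree[index]
            index -= index & -index  # drop the lowest set bit
        return total

values = [5, 3, 7, 9, 6]
fenwick = FenwickTree(len(values))
for position, value in enumerate(values, start=1):
    fenwick.add(position, value)

print(fenwick.prefix_sum(3))  # 15
fenwick.add(2, 4)             # the value at position 2 becomes 7
print(fenwick.prefix_sum(3))  # 19
```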

References

 

Castellano, Giovanna, Anna M Fanelli, and Maria A Torsello. “Incremental Indexing of Objects in Pictorial Databases.” Journal of Visual Languages and Sentient Systems 1 (2015): 23–28. Web. 20 Mar. 2016.

Faloutsos, Christos, “Indexing Multimedia Databases”, Advanced Course on Multimedia Databases In Perspective, University of Twente, the Netherlands, 1995, pp. 239-278.

Super, Boaz J. “Fast Retrieval of Isolated Visual Shapes.” Computer Vision and Image Understanding 85.1 (2002): 1–21. Web. 20 Mar. 2016.

N. K. Ratha, K. Karu, Shaoyun Chen and A. K. Jain, "A real-time matching system for large fingerprint databases," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 799-813, Aug 1996.

https://docs.oracle.com/cd/B12037_01/appdev.101/b10795/adfns_in.htm
