I am trying to assemble a custom search engine (CSE) focused on historical information about real estate properties, but with certain web sites filtered out. Google provides the CSE service free of charge, supported by ads.

Sites to be filtered include zillow.com, homes.com, and similar web products containing property appraisals and other information.

Nothing is wrong with these sites, but there are so many of them that they get in the way of what I’m really looking for: news reports and anecdotal information about a specific address. It is a type of search engine I imagine might be useful to biographers, researchers, and genealogists, among others.

For example, I am researching one Max Blumfeld, who lived at 188 Orchard Street, New York City, in the early 1900s. Searching the open Internet for that street address returns all the usual Zillow-type real estate sites, with links about Mr. Blumfeld buried several pages deep.

After configuring the CSE to exclude zillow and several other such sites, I find that they no longer appear in the search results.

So far so good.

The problem is that a great number of *other* web sites are also being excluded even though they are not in the list of sites to be skipped. The NY Times site is excluded, for example, though there’s no way I would want that site left out.

A normal search for:

        "188 orchard street" blumfeld

returns a handful of NY Times stories, as seen by clicking the link above.

A search on my CSE for the exact same query returns nothing. I cannot seem to bookmark search results on my CSE, so you will have to copy and paste the above query to see what I mean. (Note the use of quotes in the above query.) My CSE is here:

http://sorabji.com/b/building_search/

This CSE *does* find other sites, just not very many. A search for “188 Orchard Street” on the CSE returns 238 results.

Under “Sites to search” I have not included any sites, and “Search the entire web but emphasize included sites” is selected.
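As a point of comparison while debugging, the exclusion list can be approximated in an ordinary Google search with the `-site:` operator, which makes it easier to see what the CSE is dropping beyond the sites I asked it to skip. The following is only a minimal sketch: the function names `build_query` and `search_url` are made up for illustration, the two excluded domains stand in for the full list, and it builds a plain Google search URL rather than calling anything on the CSE itself.

```python
from urllib.parse import urlencode


def build_query(phrase, extra_terms, excluded_sites):
    """Build a search string: quoted phrase, extra terms, then -site: exclusions."""
    parts = [f'"{phrase}"']                              # exact-phrase match
    parts.extend(extra_terms)                            # e.g. the surname
    parts.extend(f"-site:{s}" for s in excluded_sites)   # drop the real estate sites
    return " ".join(parts)


def search_url(query):
    """Return an ordinary Google search URL for the query (bookmarkable)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical exclusion list; the real one is longer.
query = build_query("188 orchard street", ["blumfeld"], ["zillow.com", "homes.com"])
print(query)
print(search_url(query))
```

If the plain search with the exclusions still returns the NY Times stories, the missing results are coming from the CSE rather than from the exclusion list.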

I posted this to the CSE support forum but got no reply. I am putting it out here in case anyone else has a similar problem and knows what the dealio is.