Wikipedia:Reference desk/Archives/Computing/2018 July 31

Computing desk


July 31

Is there a search engine with "ignore phrase" function?

In Google, you can search for words and phrases, and with a minus sign also exclude them from the search. But is there a function in any search engine where you can *ignore* a phrase for your search? Example: Searching for "America" returns all pages including the word. Searching for "America" -"Captain America" excludes all pages mentioning "Captain America" from the results. But I want to search for "America" outside of the phrase "Captain America": instead of excluding pages that also contain "Captain America", the search should include them if they also have an "America" in a different context. Is there such a search engine? --KnightMove (talk) 12:12, 31 July 2018 (UTC)

What you are looking for is a search engine that uses regular expressions, sometimes called "regex" or "regexp". Google doesn't support them because they give you too much control over what results you see and reduce Google's ability to decide for you what you should and should not see. Always remember this basic principle: When a large corporation such as Google, Facebook, Twitter, etc. lets you use a website for free -- a website that costs a lot to maintain -- you aren't the customer. You are the product that is being sold to the real customers. --Guy Macon (talk) 14:38, 31 July 2018 (UTC)
There appears to be a less sneaky explanation: "The only possible way to make keyword searching efficient over hundreds of terabytes ... is to precompute an index of words... you can write arbitrary regexps that will gobble up near infinite amounts of CPU time and memory. For all these reasons it would be technical insanity for them to offer regex searching to the general public."
Still, this doesn't exclude the possibility of a search engine still supporting such an "ignore" function. --KnightMove (talk) 15:44, 31 July 2018 (UTC)
Regexes have almost nothing to do with this problem. They're not how search engines work, and they're not a useful way to do exclusions. Andy Dingley (talk) 16:47, 31 July 2018 (UTC)
Not answering your question, but some general discussion. Search engine capabilities vary by company (see Proximity search (text)) and over the years[1]. Most search engine indexing uses some form of inverted index, which (I'm guessing) could probably support your "ignore". Companies probably don't offer it because they focus on the 80/20 rule and "smart search" (e.g. semantic similarity). StrayBolt (talk) 18:05, 31 July 2018 (UTC)
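To make the inverted-index guess above concrete, here is a minimal Python sketch of how a positional inverted index could answer an "ignore phrase" query. It is an illustration only, not how any real search engine is known to work. The function names (`tokenize`, `build_index`, `search_ignoring_phrase`) and the sample documents are hypothetical, and the sketch assumes the search term occurs exactly once inside the ignored phrase.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Build a positional inverted index: token -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token][doc_id].append(pos)
    return index

def search_ignoring_phrase(index, term, phrase):
    """Return ids of documents where `term` occurs at least once outside `phrase`.

    Assumption: `term` appears exactly once among the words of `phrase`.
    """
    term = term.lower()
    words = tokenize(phrase)
    offset = words.index(term)  # position of the term inside the phrase
    hits = set()
    for doc_id, positions in index.get(term, {}).items():
        for pos in positions:
            start = pos - offset  # where the phrase would have to begin
            in_phrase = all(
                start + i in index.get(word, {}).get(doc_id, ())
                for i, word in enumerate(words)
            )
            if not in_phrase:
                # This occurrence is not part of the ignored phrase.
                hits.add(doc_id)
                break
    return sorted(hits)

# Hypothetical sample documents.
docs = {
    1: "Captain America is a film",
    2: "Captain America tours America",
    3: "South America travel guide",
}
index = build_index(docs)
results = search_ignoring_phrase(index, "America", "Captain America")
print(results)
```

Document 1 is dropped because its only "America" is inside the phrase, while document 2 is kept because it has a second, independent "America". The lookup only touches the posting lists of the words involved, which is why this kind of query does not need a full scan of every page.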
Google search focuses not on giving you what you want to know, but on guessing what you expect to see. Advanced search doesn't figure into that business model. 78.0.231.46 (talk) 21:22, 1 August 2018 (UTC)

duckduckgo.com seems to support this. E.g. [2] 173.228.123.166 (talk) 21:22, 3 August 2018 (UTC)

Far as I can tell that's the same as Google (or Bing). It's including results which have willy but not the phrase "free willy". It won't include the blog post where Milly talks about how she and Willy went to watch Free Willy. Nil Einne (talk) 04:54, 4 August 2018 (UTC)
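The difference between "exclude" and "ignore" described in this thread can be shown on a few made-up pages. The following Python sketch is only an illustration over in-memory strings, not a model of how Google, Bing, or DuckDuckGo process queries. The page texts and the function names `exclude_phrase` and `ignore_phrase` are hypothetical.

```python
import re

# Hypothetical pages, loosely modelled on the example in the thread.
pages = {
    "review": "Free Willy is a film about a whale.",
    "blog": "Milly and Willy went to watch Free Willy.",
    "bio": "Willy grew up by the sea.",
}

def has_word(text, term):
    """True if `term` occurs in `text` as a whole word (case-insensitive)."""
    return re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE) is not None

def has_phrase(text, phrase):
    """True if `phrase` occurs anywhere in `text` (case-insensitive)."""
    return phrase.lower() in text.lower()

def exclude_phrase(term, phrase):
    """Minus-operator behaviour: drop every page that contains the phrase."""
    return sorted(
        name for name, text in pages.items()
        if has_word(text, term) and not has_phrase(text, phrase)
    )

def ignore_phrase(term, phrase):
    """Requested behaviour: blank out the phrase, then look for the term."""
    hits = []
    for name, text in pages.items():
        stripped = re.sub(re.escape(phrase), " ", text, flags=re.IGNORECASE)
        if has_word(stripped, term):
            hits.append(name)
    return sorted(hits)

excluded = exclude_phrase("Willy", "Free Willy")
ignored = ignore_phrase("Willy", "Free Willy")
print(excluded)
print(ignored)
```

With exclusion only the "bio" page survives. With the ignore behaviour the "blog" page is also returned, because it mentions Willy separately from the film title, while the "review" page is still dropped.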