February 1, 2015

Google for insecurity

Just about every study I have done relying on Google searches made me feel worse about the world. Huge numbers of people are racist and sexist; far too many children suffer from unreported abuse. But after studying the new data on sex, I actually feel better.

This data makes me feel less lonely. In my previous studies of Google data, I had found the viciousness that humans often hide. But this time around, I have seen our hidden insecurities. Men and women are united in this insecurity and confusion.

Google also gives us legitimate reasons to worry less than we do. Many of our deepest fears about how our sexual partners perceive us are unjustified. Alone, at their computers, with no incentive to lie, partners reveal themselves to be fairly nonsuperficial and forgiving. In fact, we are all so busy judging our own bodies that there is little energy left over to judge other people's.

-- Seth Stephens-Davidowitz

October 24, 2014

$GOOG knows you

In short, the Observer writes, Kurzweil believes that Google will soon "know the answer to your question before you have asked it. It will have read every email you've ever written, every document, every idle thought you've ever tapped into a search-engine box. It will know you better than your intimate partner does. Better, perhaps, than even yourself."

April 10, 2013

Search is more than web text, Google learns

Searches on traditional services, dominated by Google, declined 3 percent in the second half of last year after rising for years, according to comScore, and the number of searches per searcher declined 7 percent. In contrast, searches on topical sites, known as vertical search engines, climbed 8 percent.

While traditional searches increased again this year, other data reflects the threat to Google.

In the first quarter, spending on search ads fell 1 percent, a significant slowdown for Google, according to IgnitionOne, a digital marketing company. Last year, Google lost market share in search ads for the first time, according to eMarketer, falling to 72.8 percent from 74 percent.

This year, ad spending on traditional search engines is expected to grow more slowly than overall online ad spending, a reversal. Its growth significantly outpaced that of online ad spending until last year, eMarketer said.

Google is not watching from the sidelines. It is making more changes to search offerings, at a faster pace than it has in years.

Larry Page, its co-founder and chief executive, renamed the search division "knowledge." Google's mission, organizing the world's information, was too narrow. Now he wanted people to learn from Google.

Google now shows answers instead of just links if you search something like "March Madness," "weather" or even "my flight," in which case it can pull flight information from users' Gmail accounts.


March 21, 2013

Automated search and automated commerce begat algorithmic schlock ?

Having found its golden meme, Solid Gold Bomb wrote a computer script to churn out hundreds of T-shirt designs riffing on the phrase -- "Keep Calm and Dream On" to "Keep Calm and Dance Off." In theory, Solid Gold Bomb could be selling billions of them, for they only become "real" once an order is made. It's the infinite monkey theorem, applied to products: with time, the algorithms would produce a T-shirt someone wants.

Amazon does not vet such items, and Solid Gold Bomb is too solid to care. The advent of 3D printing will create an explosion in such phantom products.

Books got there first: Amazon brims with algorithmically produced "literature." Philip M. Parker, a marketing professor, must be the most productive, erudite writer in history: Amazon lists him as author of more than 100,000 books. His secret? An algorithm to generate page-turners like "Webster's Estonian to English Crossword Puzzles" and "The 2007-2012 Outlook for Premoistened Towelettes and Baby Wipes in Greater China" ("The moist towelette is an essential part of the lunchbox, and with the new global economy, this volume is essential," reads its only review). Some of these books might be useful, but much of algorithmic literature exists for one reason: to swindle unsuspecting customers.

When the former Wired editor Chris Anderson wrote of "the long tail" -- the idea that, thanks to the Internet, companies can look beyond blockbusters and make money on obscure products -- he never warned us it would be so long and so ugly. Somehow, well-crafted niche products have surrendered to algorithmic schlock.


-- Evgeny Morozov


January 29, 2013

Facebook search

This week, Facebook unveiled its search tool, which it calls graph search, a reference to the network of friends its users have created. The company's algorithms will filter search results for each person, ranking the friends and brands that it thinks a user would trust the most. At first, it will mine users' interests, photos, check-ins and "likes," but later it will search through other information, including status updates.

"While the usefulness of graph search increases as people share more about their favorite restaurants, music and other interests, the product doesn't hinge on this," a Facebook spokesman, Jonathan Thaw, said.

Nevertheless, the company engineers who created the tool -- former Google employees -- say that the project will not reach its full potential if Facebook data is "sparse," as they call it. But the company is confident people will share more data, be it the movies they watch, the dentists they trust or the meals that make their mouths water.

The things people declare on Facebook will be useful, when someone searches for those interests, Tom Stocky, one of the creators of Facebook search, said in an interview this week. Conversely, by liking more things, he said, people will become more useful in the eyes of their friends.

Technology: With graph search, Facebook bets on more sharing.


November 26, 2012

Google does (bisexual) /does (gay) not censor autocomplete

Google does/does not censor autocomplete

Krisztina Radosavljevic-Szilagyi, a Google spokeswoman, wrote: "The search queries that you see as part of autocomplete are a reflection of the search activity of all Web users."

Search engines have long provided clues to the topics people look up. But now sites like Google and Bing are showing the precise questions that are most frequently asked, giving everyone a chance to peer virtually over one another's shoulders at private curiosities. And they are revealing interesting patterns.

Frequently asked questions include: When will the world end? Is Neil Armstrong Muslim? Was George Washington gay?

The questions come from a feature that Google calls "autocomplete" and Microsoft calls "autosuggest." These anticipate what you are likely to ask based on questions that other people have asked. Simply type a question starting with a word like "is" or "was," and search engines will start filling in the rest.
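The idea of anticipating a question from what other people have asked can be sketched in a few lines. This is only a minimal illustration, assuming a plain list of past queries and simple prefix matching ranked by frequency; the query log below is invented, and nothing here reflects how Google or Bing actually implement the feature.

```python
from collections import Counter


def build_query_counts(query_log):
    """Count how often each normalized (lowercased, trimmed) query appears."""
    return Counter(q.strip().lower() for q in query_log if q.strip())


def autocomplete(prefix, query_counts, limit=3):
    """Return up to `limit` past queries starting with `prefix`, most frequent first."""
    prefix = prefix.strip().lower()
    matches = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    # Sort by descending count, then alphabetically so ties are deterministic.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Hypothetical query log, invented for illustration only.
LOG = [
    "is the world ending",
    "is the world ending",
    "is the world flat",
    "was george washington tall",
    "Is the World Ending ",
]

counts = build_query_counts(LOG)
print(autocomplete("is the", counts))
```

Because the suggestions are just the most common past queries, whatever many people ask in private is what everyone else gets shown.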

People who study online behavior also say the autocomplete feature reveals broader patterns, including indications that the questions people ask of search engines often veer into the sensitive and politically incorrect.

Google does/does not censor autocomplete

The proliferation of the Autocomplete function on popular Web sites is a case in point. Nominally, all it does is complete your search query -- on YouTube, on Google, on Amazon -- before you've finished typing, using an algorithm to predict what you're most likely typing. A nifty feature -- but it, too, reinforces primness.

How so? Consider George Carlin's classic comedy routine "Seven Words You Can Never Say on Television." See how many of those words would autocomplete on your favorite Web site. In my case, YouTube would autocomplete none. Amazon almost none (it also hates "penis" and "vagina"). Of Carlin's seven words, Google would autocomplete only "piss."

Until recently, even the word "bisexual" wouldn't autocomplete at Google; it's only this past August that Google, after many complaints, began to autocomplete some, but not all, queries for that term. In 2010, the hacker magazine 2600 published a long blacklist of similar words. While I didn't verify all 400 of them on Google, a few that I did try -- like "swastika" and "Lolita" -- failed to autocomplete. Is Nabokov not trending in Mountain View? Alas, these algorithms are not particularly bright: unable to distinguish between Nabokov's novel and child pornography, they assume you want the latter.
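The failure described above is what you would expect from a filter that works on words rather than meaning. The following is a rough sketch under that assumption; the blocked term and the suggestions are hypothetical, and the real lists and matching rules used by any site are not known here.

```python
def filter_suggestions(suggestions, blocked_terms):
    """Drop any suggestion that contains a blocked word, regardless of context."""
    blocked = {term.lower() for term in blocked_terms}
    kept = []
    for suggestion in suggestions:
        words = set(suggestion.lower().split())
        # A word-level check cannot tell a novel's title from an illicit query.
        if words.isdisjoint(blocked):
            kept.append(suggestion)
    return kept


# Hypothetical inputs, invented for illustration only.
SUGGESTIONS = ["lolita nabokov novel", "london weather"]
print(filter_suggestions(SUGGESTIONS, ["Lolita"]))
```

The query about the novel is suppressed along with everything else containing the blocked word, which is the over-blocking the passage complains about.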

Why won't tech companies let us freely use terms that already enjoy wide circulation and legitimacy? Do they fashion themselves as our new guardians? Are they too greedy to correct their algorithms' mistakes?



September 11, 2012

Amazon 2011

New features abound, of course, but they're the sort that university teachers and other white-collar workers know all too well: ways of doing more with less, by making workers (or customers) handle the routine chores that used to be done for them. Nowadays you can tag a given "product" for Amazon so that it knows what you think of a book; if you want, you can even study a tag cloud that lists and ranks the most popular customer tags, so that you'll do a better job of tagging for the company. You can enter a customer discussion or post a review.

And, of course, whenever you buy a book, you help Amazon not only gauge the book's popularity, but also identify the other books that you have bought as well. It's an efficient, thoroughly commercial counterpart to the old information system. The simple, elegant Web page that once showered discriminating customers with information now invites the consumer to provide information of every sort for Amazon to digest and profit from.


May 15, 2012

Just bing it

If a person searched for the movie "The Avengers," for example, Bing would annotate the results to indicate whether the searcher's Facebook friends had "liked" any of the Web pages found in that search previously on the social network.

Microsoft executives said that approach, on its own, did not have much success, partly because it cluttered the display of search results. "It was a good experiment, but it wasn't working in the way we expected," said Derrick Connell, a corporate vice president of Bing program management.

The new Bing has a much cleaner design that tucks all of the social search results away into a sidebar on the Bing search results pages, where they are now clearly distinct from the traditional Bing search results on the left side of the screen.

But the revamping also goes much further in the kind of information it picks up from Facebook.

For the search for "best hotels in Maui," for example, the results will also allow searchers to post questions about favorite hotels to the friends with Maui expertise that Bing has identified, without leaving the Bing search results page.


May 6, 2012

Use technology to stay private, online

If you do not want the content of your e-mail messages examined or analyzed at all, you may want to consider lesser-known free services like HushMail, RiseUp and Zoho, which promote no-snooping policies. Or register your own domain with an associated e-mail address through services like Hover or BlueHost, which cost $55 to $85 a year.

Another shrouding tactic is to use the search engine DuckDuckGo, which distinguishes itself with a "We do not track or bubble you!" policy. Bubbling is the filtering of search results based on your search history. (Bubbling also means you are less likely to see opposing points of view or be exposed to something fresh and new.)

Regardless of which search engine you use, security experts recommend that you turn on your browser's "private mode," usually found under Preferences, Tools or Settings. When this mode is activated, tracking cookies are deleted once you close your browser, which "essentially wipes clean your history," said Jeremiah Grossman, chief technology officer with WhiteHat Security, an online security consulting firm in Santa Clara, Calif.


April 14, 2012

The purpose of Google

Is Google still a search company ?

A former Googler, James Whittaker, now working for Microsoft, wrote on a Microsoft blog: "The old Google made a fortune on ads because they had good content. It was like TV used to be: make the best show and you get the most ad revenue from commercials. The new Google seems more focused on the commercials themselves."


April 11, 2012

Search by voice

Nuance, meanwhile, has similarly ambitious plans for its health care business. In collaboration with I.B.M., the company is developing analytics to scour the medical notes that doctors dictate after they see patients. The idea is to search the text for common red flags -- like medicines that interact dangerously -- and automatically alert doctors, hopefully reducing problems and health care costs.

US Airways introduced Wally last summer, as part of a relocation of its offshore customer service call-in operations back to the United States. Nuance designed the system to anticipate callers' requests. Wally, for example, can automatically tell frequent-flier members their seat assignments or report whether they have received upgrades. It also converts people's speech to text, so that, should customers ask to speak to a live operator, they don't have to repeat their original requests.

The lack of disclosure bothers Sherry Turkle, a professor of the social studies of science and technology at the Massachusetts Institute of Technology. As voice-enabled systems become more sophisticated, she says, they create the illusion that we are interacting with other people, rather than with machines. In the long term, she says, the systems' sleekness and ease of use could end up diminishing the value of slower, messier, real human connections. Reminding users that they are talking to a machine can make them more conscious of the superficiality of the exchange.

"We need to make a cultural decision," Professor Turkle says. "Either we want to alert people when they are talking to a machine, or we don't."

Soon, Mr. Sejnoha predicts, many other devices, not just televisions, will be taking voiced commands, and talking back. In Germany, people can already ask a Nuance-powered coffee maker -- marketed as "the first fully automatic machine that obeys" speech -- to make cappuccino. The machine, called the Jura Impressa Z7 One Touch Voice, speaks both English and German.

See also: Goldman, backing Nuance, smashed Dragon.


April 2, 2012

Google and Facebook, the new gatekeepers

Companies that make use of these algorithms must take this curative responsibility far more seriously than they have to date. They need to give us control over what we see -- making it clear when they are personalizing, and allowing us to shape and adjust our own filters. We citizens need to uphold our end, too -- developing the "filter literacy" needed to use these tools well and demanding content that broadens our horizons even when it's uncomfortable.

Then came the Internet, which made it possible to communicate with millions of people at little or no cost. Suddenly anyone with an Internet connection could share ideas with the whole world. A new era of democratized news media dawned.

You may have heard that story before -- maybe from the conservative blogger Glenn Reynolds (blogging is "technology undermining the gatekeepers") or the progressive blogger Markos Moulitsas (his book is called "Crashing the Gate"). It's a beautiful story about the revolutionary power of the medium, and as an early practitioner of online politics, I told it to describe what we did at But I'm increasingly convinced that we've got the ending wrong -- perhaps dangerously wrong. There is a new group of gatekeepers in town, and this time, they're not people, they're code.

Today's Internet giants -- Google, Facebook, Yahoo and Microsoft -- see the remarkable rise of available information as an opportunity. If they can provide services that sift through the data and supply us with the most personally relevant and appealing results, they'll get the most users and the most ad views. As a result, they're racing to offer personalized filters that show us the Internet that they think we want to see. These filters, in effect, control and limit the information that reaches our screens.


December 26, 2011

Optimizing resume for keyword scanners

It's more than just single keywords that make you stand out from the crowd:

After all, a lot of other people are making sure that their resumes mimic the words mentioned in job descriptions as well. Instead, Lifehacker suggests that many companies now look for semantic matches, which are related terms like CPA, accounting, audits, and SEC to ensure that your resume represents real-world, useful, and related experience rather than just being stuffed with keywords. For an example of how this works, check out's Power Resume Search Test Drive.
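As a rough illustration of matching on related terms rather than one keyword, the sketch below scores a resume by how much of a cluster of related terms it covers. The cluster reuses the four terms from the example above; the scoring rule and the function names are assumptions made for illustration, not a description of any real resume scanner.

```python
import re

# Cluster of related terms, taken from the accounting example in the text.
RELATED_TERMS = {"cpa", "accounting", "audits", "sec"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def related_term_score(resume_text, related_terms=RELATED_TERMS):
    """Return (fraction of the cluster found in the resume, sorted list of terms found)."""
    found = tokenize(resume_text) & related_terms
    return len(found) / len(related_terms), sorted(found)


# Hypothetical resume line, invented for illustration only.
print(related_term_score("CPA with ten years of accounting and SEC filings"))
```

A resume that repeats one keyword many times still covers only a small part of the cluster, so stuffing alone would not raise this kind of score.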

-- CBS

November 17, 2011

Businesses are inherently about people and relationships: social networks and sharing to aid growth of online business apps ?

What's happening at Journal Communications is one small win for Google and its cloud computing challenge to Microsoft's lucrative Office division, maker of Microsoft Word and PowerPoint. But more than 4 1/2 years after Google Apps for business made its debut, the question remains how much of a dent Google is making in Microsoft's business.

Microsoft says Google's efforts are hardly noticeable. But Google executives say that more and bigger companies are signing up for the cloud service.

Possibly more important to Google is the way that Apps helps Google build social networks inside business. If successful, it would be a threat to Microsoft's biggest division and would create another inroad in its struggle with Facebook to dominate users' online lives.

"Businesses are inherently about people and relationships," said David Girouard, who runs Google's Apps business. Predictable things, like figuring out the supplies needed for manufacture, were "not the minimum to play," he said. "You need to have a social system, where a guy can introduce an idea about a new supplier, and he gets input from a lot of people quickly."


July 12, 2011

Newsboys got klout

If you have a Facebook, Twitter or LinkedIn account, you are already being judged -- or will be soon. Companies with names like Klout, PeerIndex and Twitter Grader are in the process of scoring millions, eventually billions, of people on their level of influence -- or in the lingo, rating "influencers." Yet the companies are not simply looking at the number of followers or friends you've amassed. Rather, they are beginning to measure influence in more nuanced ways, and posting their judgments -- in the form of a score -- online.

To some, it's an inspiring tool -- one that's encouraging the democratization of influence. No longer must you be a celebrity, a politician or a media personality to be considered influential. Social scoring can also help build a personal brand. To critics, social scoring is a brave new technoworld, where your rating could help determine how well you are treated by everyone with whom you interact.

"Now you are being assigned a number in a very public way, whether you want it or not," said Mark W. Schaefer, an adjunct professor of marketing at Rutgers University and the executive director of Schaefer Marketing Solutions. "It's going to be publicly accessible to the people you date, the people you work for. It's fast becoming mainstream."

Influence scores typically range from 1 to 100. On Klout, the dominant player in this space, the average score is in the high teens. A score in the 40s suggests a strong, but niche, following. A 100, on the other hand, means you're Justin Bieber. On PeerIndex, the median score is 19. A perfect 100, the company says, is "god-like."

Don't just be found, be sought.


May 16, 2011

Blekko /update

Google keeps making its search engine faster and easier -- I had to type only the letters "kidn" to get information on kidney stone treatments -- and the company notes that the billions of searches each day and its 66 percent market share prove that consumers find it useful. A long list of challengers who have fallen seem to prove that point -- Alta Vista, Yahoo, Ask Jeeves, Cuil, Kosmix, SearchMe and Wikisearch, to name only a few.

Still, Mr. Skrenta, who sold his first company to Netscape and then was a co-founder of Topix, which aggregates local news, is staking his reputation and his investors' money on a search engine called Blekko. Mr. Skrenta pitched his investors with the notion that there is still money to be made in search because of the high price that the two big competitors get for search terms and advertising. If Blekko could get even a small part of that revenue, the investors would reap a healthy return on their money.

His idea is to concentrate the search. Only a relatively small number of the Web's total pages are visited -- in the tens of millions rather than in the hundreds of billions. In his view, it should be possible to simplify a search engine so it could satisfy a vast majority of searchers.

Blekko uses a search algorithm like Google's or Bing's but also gets humans, mostly volunteers, to identify the sites they know, trust and visit most often and to put those at the top of the search results.

"The best site may not have the best S.E.O.," Mr. Skrenta says.

It is a Wikipedia model -- or Huffington Post model -- applied to search. Some people apparently will work for no pay if they are convinced that their efforts will help or influence others. Experts who care enough about a topic edit the results. For instance, editors trawling the health results may give a higher ranking to the Web pages written by medical experts at the Cleveland Clinic and the Mayo Clinic than those generated for eHow by writers getting paid a few dollars per piece.

Using Blekko takes a little more effort. It works the way Google or Bing does, but if you want cleaner search results you must type in a slash mark and a category. The company calls them slashtags. Typing "/conservative" after "taxes," for instance, would give you sites written from the right; "/liberal" gives you the other side.
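The slashtag mechanism can be sketched as a query parser plus a filter against a hand-edited site list. This is a minimal sketch, assuming each slashtag maps to a set of hostnames chosen by editors; the hostnames, the sample results and the function names are all hypothetical, and Blekko's real implementation is not described in the source.

```python
from urllib.parse import urlparse

# Hypothetical curated lists: slashtag -> hostnames that editors trust for it.
SLASHTAGS = {
    "conservative": {"right.example"},
    "liberal": {"left.example"},
}


def parse_query(raw):
    """Split 'taxes /conservative' into ('taxes', 'conservative'); tag is None if absent."""
    terms, sep, tag = raw.partition("/")
    tag = tag.strip().lower() if sep else ""
    return terms.strip(), (tag or None)


def search(raw, results, slashtags=SLASHTAGS):
    """Return URLs whose text mentions the terms, limited to the slashtag's sites if given."""
    terms, tag = parse_query(raw)
    hits = [url for url, text in results if terms.lower() in text.lower()]
    if tag is None:
        return hits
    allowed = slashtags.get(tag, set())  # an unknown slashtag matches no sites
    return [url for url in hits if urlparse(url).hostname in allowed]


# Hypothetical index of (url, page text) pairs, invented for illustration only.
RESULTS = [
    ("http://right.example/a", "Taxes are too high"),
    ("http://left.example/b", "Taxes fund services"),
    ("http://spam.example/c", "taxes taxes taxes"),
]

print(search("taxes /conservative", RESULTS))
print(search("taxes", RESULTS))
```

Without a slashtag the keyword-stuffed page still shows up; with one, only sites on the edited list survive, which is the trade of extra typing for cleaner results.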

Blekko also sorts results for financial advice or sports. And it has some rather esoteric experts who have edited results for "gluten free" and "material safety data sheets," a category containing information on the properties of myriad substances. If someone tries to game the results, an expert presumably would block the efforts.

March 12, 2011

Own your reputation on line

The narrow focus on privacy as a form of control misses what really worries people on the Internet today. What people seem to want is not simply control over their privacy settings; they want control over their online reputations. But the idea that any of us can control our reputations is, of course, an unrealistic fantasy. The truth is we can't possibly control what others say or know or think about us in a world of Facebook and Google, nor can we realistically demand that others give us the deference and respect to which we think we're entitled. On the Internet, it turns out, we're not entitled to demand any particular respect at all, and if others don't have the empathy necessary to forgive our missteps, or the attention spans necessary to judge us in context, there's nothing we can do about it.


March 3, 2011

Share your web history ?

... is banking on our willingness to take that next step toward taking our lives public: namely, by automatically tracking personal browsing histories for public viewing.

All that sharing can open up new and tricky fields of interplay in relationships. Mina Tsay, a communications professor at Boston University who studies the psychological and social effects of media, said that in her studies of Facebook she found that frequent users saw the world as significantly more public than less-frequent users did -- a source of misunderstanding familiar to many social media users.

Privacy notwithstanding, Dr. Tsay said social media's evolution might create more-passive consumers of information: people too reliant on others to decide what's interesting, stylish or valuable.

"In some ways, this might produce a society in which we end up conforming to buying the same products, seeing the same information, going on the same trip, depending on the same sources," she said.


November 30, 2010


Dan Duncan advocated the "don't get mad, get even" strategy for Yves' Naked Capitalism:

While you sort it out, always include several internal links to other posts. As long as you have internal links to your other work, then at least the scraped content will get you deep links to your back pages.

Other technological considerations: Instead of a simple HTAccess denial (i.e., simply denying access from the offending IP address), do an HTAccess "re-write". By doing this, you don't block access... rather, you send the asshole "false" content of your choice. It could be a HUGE file of gibberish like "hy^&GBHBDFNLG#$&H%" ...or even better send them "The Best of DownSouth"! ["Please Yves of Naked Cap, we won't ever scrape your site again. Please, just-make-it-stop! We're begging you!"] [Of course, you are more than welcome to send them my commentary as well.]

Or, you could send the scraper into an infinite loop with something like this in HTAccess:

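# Redirect every request from the scraper's IP range to the feed URL below
RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^123\.123\.123\.
RewriteRule ^(.*)$ http://domain.tld/feed [R,L]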

Replace the IP address with that of the scraper and replace the feed URL with the feed from the scraper's site. That would actually be amusing. If you do this, please let us know what happens.

Here are some other good blacklist options from a helpful site:

Also, beyond the Cease and Desist, you need to file DMCA Reports with the Search Engines.

And finally, since they are scraping to game Google go to Google:

November 10, 2010

Startpage, Scroogle private

Use a search engine that does not track users' activity. Scroogle lets you search with Google without being tracked or seeing ads.

Startpage runs simultaneous searches on multiple engines anonymously.

November 1, 2010


Blekko, a search engine that will open to the public on Monday.

Rich Skrenta, Blekko's co-founder and chief executive, says that since Google started, the Web has been overrun by unhelpful sites full of links and keywords that push them to the top of Google's search results but offer little relevant information. Blekko aims to show search results from only useful, trustworthy sites.

"The goal is to clean up Web search and get all the spam out of it," Mr. Skrenta said.

Blekko's search engine scours three billion Web pages that it considers worthwhile, but it shows only the top results on any given topic. It calls its edited lists of Web sites slashtags. The engine also tries to weed out Web pages created by so-called content farms like Demand Media that determine popular Web search topics and then hire people at low pay to write articles on those topics for sites like

It is also drawing on a fruitful category of Web search -- vertical search engines that offer results on specific topics. Many companies assume that Google won the contest to search the entire Web, so they have focused on topical search. Bing from Microsoft has search pages dedicated to travel and entertainment, and Yelp is a popular choice for searching local businesses.

People who search for a topic in one of seven categories that Blekko considers to be polluted with spamlike search results -- health, recipes, autos, hotels, song lyrics, personal finance and colleges -- automatically see edited results.

Users can also search for results from one site ("iPad/Amazon," for instance, will search for iPads on Amazon), narrow searches by type ("June/people" shows people named June) or search by topic. "Climate change/conservative" shows results from right-leaning sites, and "Obama/humor" shows humor sites that mention the president. Blekko has made hundreds of these slashtags, and users can create their own and revise others.

See also: Cuil, Inktomi, Teoma, ...


October 18, 2010

Whenever people make decisions, there's money involved -- Charlene Li

"Google's made a lot of money helping people make decisions using search engines, but more and more people are turning to social outlets to make decisions," said Charlene Li, founder of Altimeter Group, a technology research and advisory firm. "And whenever people make decisions, there's money involved."


March 28, 2010

data mining vs privacy

Computer scientists and policy experts say that such seemingly innocuous bits of self-revelation can increasingly be collected and reassembled by computers to help create a picture of a person's identity, sometimes down to the Social Security number.

"Technology has rendered the conventional definition of personally identifiable information obsolete," said Maneesha Mithal, associate director of the Federal Trade Commission's privacy division. "You can find out who an individual is without it."

Carter Jernigan and Behram Mistree analyzed more than 4,000 Facebook profiles of students, including links to friends who said they were gay. The pair was able to predict, with 78 percent accuracy, whether a profile belonged to a gay male.

On Friday, Netflix said that it was shelving plans for a second contest -- bowing to privacy concerns raised by the F.T.C. and a private litigant. In 2008, a pair of researchers at the University of Texas showed that the customer data released for that first contest, despite being stripped of names and other direct identifying information, could often be "de-anonymized" by statistically analyzing an individual's distinctive pattern of movie ratings and recommendations.
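To see why a pattern of ratings can act as a fingerprint, consider the toy sketch below. It is not the researchers' method, which the excerpt does not detail; it simply assumes an attacker knows a few of a person's ratings from some public source and looks for the anonymized record that agrees with the most of them. All records, titles and names are invented.

```python
def match_score(known, record, tolerance=1):
    """Count known (movie -> rating) pairs the record reproduces within `tolerance` stars."""
    return sum(
        1
        for movie, rating in known.items()
        if movie in record and abs(record[movie] - rating) <= tolerance
    )


def best_match(known, dataset, tolerance=1):
    """Return (record_id, score, runner_up_score) for the best-matching anonymized record."""
    scored = sorted(
        ((match_score(known, rec, tolerance), rid) for rid, rec in dataset.items()),
        reverse=True,
    )
    top_score, top_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    return top_id, top_score, runner_up


# Hypothetical anonymized data set: names removed, rating patterns kept.
ANON = {
    "user_17": {"Movie A": 5, "Movie B": 1, "Movie C": 4, "Movie D": 2},
    "user_42": {"Movie A": 3, "Movie C": 4},
    "user_99": {"Movie B": 5, "Movie D": 5},
}

# Hypothetical ratings the attacker already knows about one person.
KNOWN = {"Movie A": 5, "Movie B": 1, "Movie D": 2}

print(best_match(KNOWN, ANON))
```

A large gap between the best score and the runner-up suggests the pattern is distinctive enough to single out one record even though no name appears in the data.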

The pair of researchers who cracked Netflix's anonymous database: Vitaly Shmatikov, an associate professor of computer science at the University of Texas, and Arvind Narayanan, now a researcher at Stanford University.

By examining correlations between various online accounts, the scientists showed that they could identify more than 30 percent of the users of both Twitter, the microblogging service, and Flickr, an online photo-sharing service, even though the accounts had been stripped of identifying information like account names and e-mail addresses.


March 27, 2010

renrou sousuo yinqing (Human-flesh search engines)

Human-flesh search engines -- renrou sousuo yinqing -- have become a Chinese phenomenon: they are a form of online vigilante justice in which Internet users hunt down and punish people who have attracted their wrath.

The popular meaning is now not just a search by humans but also a search for humans, initially performed online but intended to cause real-world consequences. Searches have been directed against all kinds of people, including cheating spouses, corrupt government officials, amateur pornography makers, Chinese citizens who are perceived as unpatriotic, journalists who urge a moderate stance on Tibet and rich people who try to game the Chinese system. Human-flesh searches highlight what people are willing to fight for: the political issues, polarizing events and contested moral standards that are the fault lines of contemporary China.

Posted to Asia, search, words.


March 24, 2010

Plagiarism Software Spared The Times an Embarrassment ?

Could Plagiarism Software Have Spared The Times an Embarrassment?
Craig Silverman, the editor of Regret the Error, a Web site that reports on accuracy and honesty in the press, says most plagiarism by journalists is caught only when someone complains. That's what happened last month at The Times, which had to endure the mortifying experience of having a bitter cross-town rival, The Wall Street Journal, point out the theft of half a dozen passages from one of its news articles.

Silverman thinks The Times could have avoided the embarrassment with computer software designed to ferret out plagiarism by comparing news articles about to be published with millions of published works on the Web and in various databases. Such software is in wide use in the academic world, but has few takers in the news industry. Silverman said it makes many journalists uncomfortable because it seems to assume guilt.
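A rough sketch of how such a comparison might work, assuming a simple word n-gram ("shingle") overlap. This is not a description of any particular product; the function names, the shingle size and the sample texts are hypothetical.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams (shingles) in text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def shared_passages(draft, published, size=5):
    """Shingles that appear in both the draft and a published work."""
    return shingles(draft, size) & shingles(published, size)


def flag_sources(draft, corpus, size=5, min_shared=1):
    """Map each published source to the number of shingles it shares with the draft."""
    hits = {}
    for source, text in corpus.items():
        shared = len(shared_passages(draft, text, size))
        if shared >= min_shared:
            hits[source] = shared
    return hits


# Hypothetical published works and a draft awaiting publication.
corpus = {
    "rival_paper": "The board voted late on Tuesday to approve the merger after weeks of talks.",
    "wire_story": "Shares rose sharply in early trading.",
}
draft = "Sources said the board voted late on Tuesday to approve the merger, ending the dispute."

print(flag_sources(draft, corpus))
```

A flagged source is only a prompt for an editor to look, which fits Silverman's framing of catching accidents rather than assuming guilt.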

Most journalists who commit plagiarism, like Zachery Kouwe at The Times, say they did not intend to take the words of others. "If it really is an accident," Silverman argues, "let's catch the accident before it gets into print." You can read more of Silverman's case.

November 23, 2009

Fx: Forex trading profits

Do Forex trading profits go to traders, brokers, software vendors ?
A review of the foreign exchange advertising economy.

September 18, 2009

TinEye image duplicate search

TinEye image search, updates.

Given an image file, find similar or identical images on the web.

July 2, 2009

Facebook using Bing

Bing makes inroads in the search market.

June 25, 2009

Twitter search One Riot

A number of search start-ups have appeared recently that differentiate their offerings from older search engines' by playing up their specialized focus on the real-time Web. For example, OneRiot, based in Boulder, Colo., covers Twitter among other social media, but it has an intriguing means of reducing Twitter spam: it does not index the text in tweets -- it plucks only the links, reasoning that the videos, news stories and blog posts that are being shared are what others will be most interested in.

OneRiot follows the link, checks for spam by comparing the content of the page with the content of the tweet, and then uses its own algorithms to figure out where the link should go in its always-changing index of "hot" items.

Strictly speaking, this is not real-time processing. But checking links before adding them to the index seems to be time well spent.
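The link-first approach described above can be sketched roughly as follows. This is not OneRiot's code: the word-overlap test, the 0.5 threshold, the function names and the sample pages are all assumptions, and the ranking step ("its own algorithms") is left out.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
WORD_PATTERN = re.compile(r"[a-z0-9']+")


def extract_links(tweet):
    """Keep only the links from a tweet; the tweet text itself is not indexed."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(tweet)]


def words(text):
    """Lower-cased set of words in text."""
    return set(WORD_PATTERN.findall(text.lower()))


def looks_like_spam(tweet, page_text, min_overlap=0.5):
    """Flag a link when too few of the tweet's words appear on the linked page."""
    tweet_words = words(URL_PATTERN.sub(" ", tweet))
    if not tweet_words:
        return False  # a bare link gives nothing to compare against
    overlap = len(tweet_words & words(page_text)) / len(tweet_words)
    return overlap < min_overlap


def links_to_index(tweet, fetch_page):
    """Return the tweet's links whose page content is consistent with the tweet."""
    return [
        url
        for url in extract_links(tweet)
        if not looks_like_spam(tweet, fetch_page(url))
    ]


# Hypothetical page contents standing in for a real fetch.
pages = {
    "http://example.com/launch": "Video of the rocket launch in Florida this morning",
    "http://example.com/pills": "Buy cheap pills online now",
}
honest = "Amazing rocket launch video http://example.com/launch"
spammy = "Amazing rocket launch video http://example.com/pills"

print(links_to_index(honest, pages.get))
print(links_to_index(spammy, pages.get))
```

Passing the fetch step in as a function keeps the sketch runnable without network access; a real service would fetch each page at this point, which is part of why the processing is not strictly real-time.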

Tobias Peggs, general manager at OneRiot, said his company could process, check and index a link within 37 seconds. When asked why he bothered to measure the seconds if it took 20 or more minutes just to receive searchable tweets from Twitter, he explained that the delays at Twitter's search site did not affect his company's search service, which receives the data stream at the same time Twitter's own search engine does. Because one venture capital firm, Spark Capital, has invested in both OneRiot and Twitter, OneRiot has "access to Twitter data that other third parties don't," Mr. Peggs said.


March 16, 2009

Kosmix web search, automated about page builder

Kosmix automatically builds web pages for a given search target.
A bigger threat to than to Google. By default, it does give a good "what's hot" zeitgeist of the Internet.


March 10, 2009

Wolfram Alpha Search: computation over lookup

Wolfram Alpha searching will be very logical.

Update 2009 May 11:

In its current state, there are many queries that WolframAlpha cannot answer, either because it does not understand the question or because it does not have the requisite data. For instance, it is stumped by queries like "obesity rate," "housing prices New York" or "unemployment San Francisco" (but it will answer "unemployment San Francisco County").


January 6, 2009

Good wage for reading Facebook ads: 90 dollars per hour

Facebook's targeted advertising throws up three very similar adjacent display placements onto one page. Is this evidence of a price-searching algorithm, meant to weed out wages too low to be tempting or too high to be believed (not to mention too high to be actually available) ?


$92/hr ?

$75/hr ?

$88/hr ?

December 17, 2008

Google upgrades YouTube

Were the Channels and Subscribe features insufficient ?

Google struck back yesterday, launching two new important YouTube features. The first is YouTube's new high-definition option, which switches to wide screen and features much higher resolution than the usual fare. Since most videos are not HD-formatted, YouTube has set up an "HD Videos Area," where users can search for the highest-quality films the site has to offer. Low-resolution video has been one of the issues keeping advertisers from throwing money at the site, and this may help turn things around.

YouTube's second initiative tackles the site's maddening lack of navigability. Even though companies like CBS and MGM have signed deals to post feature-length shows on YouTube, no one can find them, thanks to the peculiar architecture of the Web site. Now, YouTube has started collecting movies, music, and news on three separate landing pages. The news page will offer video broadcasts of breaking news, and the music and movies pages will showcase the most popular songs and feature-length films, broken down by category. Users will still find themselves lost in YouTube's architecture most of the time, but at least it's a start.

-- Feeling Lucky

November 16, 2008

Google, know thyself and do not share certificates.


July 28, 2008

Cuil search


Former Google employees are unveiling a search engine that they promise will be more comprehensive than Google's and hope will give its users more relevant results.

Technology: Former Employees of Google Prepare Rival Search Engine
Published: July 28, 2008
See also, Vivisimo, Snap, Mahalo and Powerset.

Update 2010 Nov 1:

Blekko has raised $24 million in venture capital from prominent investors like Marc Andreessen, Ron Conway and U.S. Venture Partners. It plans to sell Google-like search ads associated with keywords and slashtags.

Some start-ups that have taken on search have been folded into the big companies, like Powerset, which Microsoft bought in 2008. Others, like Cuil, a search engine started by former Google engineers in 2008, were flops. Blekko's slashtags could be subject to spam since anyone can edit them, but Blekko says it will avoid that with an editor and Wikipedia-style policing by users.

A New Search Engine, Where Less Is More
Published: October 31, 2010
Blekko aims to show search results from only trustworthy sites, weeding out sites filled with little relevant information

June 6, 2007

BuzzFeed's what's hot

BuzzFeed shows what is infatuatingly hot.

Via BuzzFeed

August 9, 2006

Yahoo Publisher Network, RSS widgets

Yahoo Publisher Network (YPN) takes on
Google AdSense.

Also Y! now offers fully featured widgets via RSS feeds.

What to do with RSS feeds: aggregate into Feed Groups !
See NetVibes.

July 15, 2006

Vivisimo, internet search v5

New Vivisimo, better crawling and grouping.

Previously: Vivisimo clusty headline news

July 14, 2006

Shakespeare @ google: soliloquy search

Comedy, tragedy, romance. The elegant design
of Shakespeare's life work.

Not yet: Browse by or search for work's name Hamlet, character
name Ophelia or search for content text nunnery.

June 26, 2006

NY Real Estate maps: Shark Bites

Curbed and Property Shark team up for the NY Property Map theme of the week at Shark Bites.

May 21, 2006

Snap job

Snap Job Search.
Best use of incremental search partitioning and refinement
of multi-faceted search and browsing.

Snap journal.

Battelle comments.

May 5, 2006

Chicago personal injury lawyer or New York lasik

Chicago personal injury lawyer or New York lasik laser eye surgery are
valuable search words.

So hire a Chicago personal injury lawyer if your New York lasik fails.


December 18, 2005

Google Earth

For GPS, GIS map junkies: Google Earth for Mac OS X.

November 6, 2005


Search for stylized facts at Plazoo.

October 27, 2005

Google News Report USA Score

Fetch headlines from Google News on a schedule, then rank
headlines by factors:

* appearance day and time,
* prominence on the Google News page,
* number of appearances,
* others;

weighted to estimate referer traffic these links bring to their

Listed are the top scoring stories in recent time periods, followed
by a ranking of sources. More detailed reports are linked-to at the
bottom of each table.
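A minimal sketch of how such a score could be computed from the listed factors. The actual weights behind the report are not given here, so the ones below (reciprocal page position, a 24-hour half-life, a plain sum over appearances) are assumptions, as are the function names and sample data.

```python
from collections import defaultdict
from datetime import datetime


def sighting_score(position, seen_at, now, half_life_hours=24.0):
    """Score one appearance: higher near the top of the page and when recent."""
    prominence = 1.0 / position  # position 1 is the top slot on the page
    age_hours = (now - seen_at).total_seconds() / 3600.0
    recency = 0.5 ** (age_hours / half_life_hours)  # halves every half_life_hours
    return prominence * recency


def rank_headlines(sightings, now):
    """Sum sighting scores per headline, so repeated appearances add up."""
    totals = defaultdict(float)
    for headline, position, seen_at in sightings:
        totals[headline] += sighting_score(position, seen_at, now)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sightings from scheduled fetches: (headline, page position, fetch time).
now = datetime(2005, 10, 27, 12, 0)
sightings = [
    ("Story A", 1, datetime(2005, 10, 27, 12, 0)),
    ("Story B", 2, datetime(2005, 10, 27, 12, 0)),
    ("Story B", 2, datetime(2005, 10, 26, 12, 0)),
    ("Story C", 4, datetime(2005, 10, 25, 12, 0)),
]

for headline, score in rank_headlines(sightings, now):
    print(f"{score:.4f}  {headline}")
```

A ranking of sources would follow the same pattern, summing the scores by each headline's source instead of by headline.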


July 10, 2005

Mozbot, France's prettier Google

Search for stylized facts or for Coruscation at Mozbot, France's prettier Google.

June 26, 2005

Google tutor

Google Tutor and Google Guide's advanced operators reference are full
of search optimization advice for Google's end users.

June 19, 2005

Google Scholar

Research citations

Google Scholar

Galegroups's review.

May 7, 2005

Google personal search history

Google Blog-o-scoped reviews Google's new search history retention
and recall.

April 10, 2005

Craig's List and Google Maps

Craig's List and Google Maps merge, and the result is good.

See for rent and for sale listings plotted on a map,
pins colourized to show availability of pictures,
drill down the matches to a feature set or price band.

January 8, 2005

Clusty headlines

Clusty headlines are groupable into clusters
by reader-specified criterion.

Clusty shows Stylized Facts as in these clusters:

Market (30)
⇨Growth (17)
⇨Statistical, Empirical (10)
⇨Interest Rates (9)
⇨Bank, Research (7)
⇨Volatility, Modeling (7)
⇨Behavior, Generate (6)
⇨Generate The Stylized Facts (5)
⇨Economic Blog (3)


November 20, 2004

searches refereed publications

searches refereed publications.
Sample search: mortgage prepayment modelling.