Wednesday, 28 October 2009

About NEO - Network Engine Optimization

Network engine optimization (NEO) is the process of improving the volume or quality of traffic to a web site from Network engines via "natural" or unpaid ("organic" or "algorithmic") search results, as opposed to Network engine marketing (SEM), which deals with paid inclusion. Typically, the earlier (or higher) a site appears in the search results list, the more visitors it will receive from the Network engine. NEO may target different kinds of search, including image search, local search, video search and industry-specific vertical Network engines. This gives a web site web presence.


As an Internet marketing strategy, NEO considers how Network engines work and what people search for. Optimizing a website primarily involves editing its content and HTML and associated coding to both increase its relevance to specific keywords and to remove barriers to the indexing activities of Network engines.


The acronym "NEO" can also refer to "Network engine optimizers," a term adopted by an industry of consultants who carry out optimization projects on behalf of clients, and by employees who perform NEO services in-house. Network engine optimizers may offer NEO as a stand-alone service or as a part of a broader marketing campaign. Because effective NEO may require changes to the HTML source code of a site, NEO tactics may be incorporated into web site development and design. The term "Network engine friendly" may be used to describe web site designs, menus, content management systems, images, videos, shopping carts, and other elements that have been optimized for the purpose of Network engine exposure.


Another class of techniques, known as black hat NEO or spamdexing, uses methods such as link farms, keyword stuffing and article spinning that degrade both the relevance of search results and the user experience of Network engines. Network engines look for sites that employ these techniques in order to remove them from their indices.


Webmasters and content providers began optimizing sites for Network engines in the mid-1990s, as the first Network engines were cataloging the early Web. Initially, all a webmaster needed to do was submit the address of a page, or URL, to the various engines which would send a spider to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed.[1] The process involves a Network engine spider downloading a page and storing it on the Network engine's own server, where a second program, known as an indexer, extracts various information about the page, such as the words it contains and where these are located, as well as any weight for specific words, and all links the page contains, which are then placed into a scheduler for crawling at a later date.
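The indexing step described above can be illustrated with a minimal Python sketch. It assumes the page has already been downloaded and only shows what an indexer might extract: each word with its positions, plus the outgoing links that would be handed to a scheduler. The class name `PageIndexer`, the function `index_page` and the example URL are hypothetical, and no real engine is claimed to work exactly this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class PageIndexer(HTMLParser):
    """Collects word positions and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []        # absolute URLs to pass to the crawl scheduler
        self.positions = {}    # word -> list of positions within the page text
        self._count = 0        # running word counter

    def handle_starttag(self, tag, attrs):
        # Record the target of every anchor, resolved against the page URL.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        # Record where each word occurs in the text of the page.
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.positions.setdefault(word, []).append(self._count)
            self._count += 1


def index_page(url, html):
    """Return (word positions, outgoing links) for one downloaded page."""
    indexer = PageIndexer(url)
    indexer.feed(html)
    indexer.close()
    return indexer.positions, indexer.links


page = '<html><body><h1>Red shoes</h1><a href="/boots">Red boots</a></body></html>'
positions, links = index_page("http://example.com/", page)
print(positions)
print(links)
```

A real indexer would also assign weights to words (for example, words in headings) and skip script or style content; both are left out here to keep the sketch short.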


Site owners started to recognize the value of having their sites highly ranked and visible in Network engine results, creating an opportunity for both white hat and black hat NEO practitioners. According to industry analyst Danny Sullivan, the phrase Network engine optimization probably came into use in 1997.[2]


Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag, or index files in engines like ALIWEB. Meta tags provide a guide to each page's content. But using meta data to index pages was found to be less than reliable because the webmaster's choice of keywords in the meta tag could potentially be an inaccurate representation of the site's actual content. Inaccurate, incomplete, and inconsistent data in meta tags could and did cause pages to rank for irrelevant searches.[3] Web content providers also manipulated a number of attributes within the HTML source of a page in an attempt to rank well in Network engines.[4]


By relying so much on factors such as keyword density, which were exclusively within a webmaster's control, early Network engines suffered from abuse and ranking manipulation. To provide better results to their users, Network engines had to adapt to ensure their results pages showed the most relevant search results, rather than unrelated pages stuffed with numerous keywords by unscrupulous webmasters. Since the success and popularity of a Network engine is determined by its ability to produce the most relevant results for any given search, allowing those results to be false would drive users to other search sources. Network engines responded by developing more complex ranking algorithms, taking into account additional factors that were more difficult for webmasters to manipulate.
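To make the weakness concrete, here is a small Python sketch of keyword density, taken here to mean the share of words on a page that match a given keyword. The function name `keyword_density` and the sample texts are hypothetical, and the definition is an assumption; engines of the period did not publish their exact formulas.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


honest = "We sell handmade leather shoes and repair old boots."
stuffed = "cheap shoes cheap shoes buy cheap shoes"

print(keyword_density(honest, "shoes"))
print(keyword_density(stuffed, "shoes"))
```

Because the score depends only on text the webmaster writes, repeating a keyword raises it at no cost, which is why a ranking based mainly on such factors is easy to manipulate.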


Graduate students at Stanford University, Larry Page and Sergey Brin, developed "BackRub," a Network engine that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm, PageRank, is a function of the quantity and strength of inbound links.[5] PageRank estimates the likelihood that a given page will be reached by a web user who randomly surfs the web, and follows links from one page to another. In effect, this means that some links are stronger than others, as a higher PageRank page is more likely to be reached by the random surfer.
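The random surfer idea can be sketched in Python as a simple iterative computation. This is an illustration under stated assumptions, not Google's implementation: the damping factor of 0.85 (the chance that the surfer follows a link rather than jumping to a random page), the fixed iteration count, the handling of pages without outgoing links, and the tiny three-page graph are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a graph given as {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the surfer equally likely to be on any page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links passes its rank to all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
for page, score in sorted(pagerank(graph).items()):
    print(page, round(score, 3))
```

In this graph page "c" receives links from both "a" and "b" and ends up with the highest score, while "a" outranks "b" because its single inbound link comes from the high-ranking page "c". That is the sense in which some links are stronger than others.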


Page and Brin founded Google in 1998. Google attracted a loyal following among the growing number of Internet users, who liked its simple design.[6] Off-page factors (such as PageRank and hyperlink analysis) were considered as well as on-page factors (such as keyword frequency, meta tags, headings, links and site structure) to enable Google to avoid the kind of manipulation seen in Network engines that only considered on-page factors for their rankings. Although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence the Inktomi Network engine, and these methods proved similarly applicable to gaming PageRank. Many sites focused on exchanging, buying, and selling links, often on a massive scale. Some of these schemes, or link farms, involved the creation of thousands of sites for the sole purpose of link spamming.[7]


By 2004, Network engines had incorporated a wide range of undisclosed factors in their ranking algorithms to reduce the impact of link manipulation. Google says it ranks sites using more than 200 different signals.[8] The three leading Network engines, Google, Yahoo and Microsoft's Bing, do not disclose the algorithms they use to rank pages. Notable NEOs, such as Rand Fishkin, Barry Schwartz, Aaron Wall and Jill Whalen, have studied different approaches to Network engine optimization, and have published their opinions in online forums and blogs.[9][10] NEO practitioners may also study patents held by various Network engines to gain insight into the algorithms.[11]


In 2005 Google began personalizing search results for each user. Depending on their history of previous searches, Google crafted results for logged-in users.[12] In 2008 Bruce Clay said that "ranking is dead" because of personalized search. It would become meaningless to discuss how a website ranked, because its rank would potentially be different for each user and each search.[13]


In 2007 Google announced a campaign against paid links that transfer PageRank.[14] In 2009 Google disclosed that they had taken measures to mitigate the effects of PageRank sculpting by use of the nofollow attribute on links.


By 1997 Network engines recognized that webmasters were making efforts to rank well in their Network engines, and that some webmasters were even manipulating their rankings in search results by stuffing pages with excessive or irrelevant keywords. Early Network engines, such as Infoseek, adjusted their algorithms in an effort to prevent webmasters from manipulating rankings.[16]


Due to the high marketing value of targeted search results, there is potential for an adversarial relationship between Network engines and NEOs. In 2005, an annual conference, AIRWeb, Adversarial Information Retrieval on the Web,[17] was created to discuss and minimize the damaging effects of aggressive web content providers.


NEO companies that employ overly aggressive techniques can get their client websites banned from the search results. In 2005, the Wall Street Journal reported on a company, Traffic Power, which allegedly used high-risk techniques and failed to disclose those risks to its clients.[18] Wired magazine reported that the same company sued blogger and NEO Aaron Wall for writing about the ban.[19] Google's Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.[20]


Some Network engines have also reached out to the NEO industry, and are frequent sponsors and guests at NEO conferences, chats, and seminars. In fact, with the advent of paid inclusion, some Network engines now have a vested interest in the health of the optimization community. Major Network engines provide information and guidelines to help with site optimization.[21][22][23] Google has a Sitemaps program[24] to help webmasters learn if Google is having any problems indexing their website and also provides data on Google traffic to the website. Google guidelines are a list of suggested practices Google has provided as guidance to webmasters. Yahoo! Site Explorer provides a way for webmasters to submit URLs, determine how many pages are in the Yahoo! index and view link information.


The leading Network engines, such as Google and Yahoo!, use crawlers to find pages for their algorithmic search results. Pages that are linked from other Network engine indexed pages do not need to be submitted because they are found automatically. Some Network engines, notably Yahoo!, operate a paid submission service that guarantees crawling for either a set fee or cost per click.[26] Such programs usually guarantee inclusion in the database, but do not guarantee specific ranking within the search results.[27] Two major directories, the Yahoo Directory and the Open Directory Project, both require manual submission and human editorial review.[28] Google offers Google Webmaster Tools, for which an XML Sitemap feed can be created and submitted for free to ensure that all pages are found, especially pages that aren't discoverable by automatically following links.[29]
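As an illustration of the XML Sitemap feed mentioned above, the following Python sketch builds a minimal sitemap from a list of page addresses. The function name `build_sitemap` and the example.com URLs are hypothetical; the namespace is the one defined by the sitemaps.org protocol. Only the required `loc` element is written, and optional fields such as last-modified dates are omitted.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML Sitemap listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "http://example.com/",
    "http://example.com/orphan-page",  # a page no other page links to
]))
```

The second address stands for the kind of page the paragraph singles out: one a crawler would not find by following links alone.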


Network engine crawlers may look at a number of different factors when crawling a site. Not every page is indexed by the Network engines. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled.


To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a Network engine's database by using a meta tag specific to robots. When a Network engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed, and will instruct the robot as to which pages are not to be crawled. As a Network engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login-specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.
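How a well-behaved crawler might apply these rules can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the crawler name "ExampleBot" and the example.com URLs are made up for this illustration; the rules block a shopping cart and internal search results, the two kinds of pages named above.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, as it might be served from the site root.
robots_txt = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler checks each URL against the parsed rules before fetching it.
for url in (
    "http://example.com/products/shoes",
    "http://example.com/cart/checkout",
    "http://example.com/search?q=shoes",
):
    print(url, parser.can_fetch("ExampleBot", url))
```

Here only the product page is reported as fetchable. The file is advisory: it tells cooperative crawlers what to skip and is not an access control mechanism.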


A variety of other methods are employed to get a webpage to show up in the search results. These include:
Cross linking between pages of the same website, giving more links to the main pages of the website to increase the PageRank used by Network engines.[32]
Linking from other websites, including link farming and comment spam.
Writing content that includes frequently searched keyword phrases, so as to be relevant to a wide variety of search queries.[33]
Adding relevant keywords to a web page's meta tags, including keyword stuffing.
Writing tailored news content for the website, using relevant keywords, which is often encouraged to complement online marketing strategies. The more content a site has, the more there is for spiders to crawl, and the better the chances of receiving incoming links.
URL normalization of web pages accessible via multiple URLs, using the "canonical" link element.
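The URL normalization item in the list above can be illustrated with a short Python sketch that maps several spellings of one address to a single form. The function name `normalize_url` and the particular rules (lower-casing the scheme and host, dropping default ports and fragments, using "/" for an empty path) are assumptions chosen for this example. A real site would publish whichever form it prefers as its canonical URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports that are implied by the scheme and can be dropped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return one consistent spelling for URLs that name the same page."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()   # hostname is already lower-cased
    # Keep the port only when it is not the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment never reaches the server, so it is dropped.
    # Any user name or password in the URL is also dropped in this sketch.
    return urlunsplit((scheme, host, path, parts.query, ""))


for variant in (
    "HTTP://Example.COM:80/shoes#top",
    "http://example.com/shoes",
    "http://example.com",
):
    print(normalize_url(variant))
```

The first two variants collapse to the same address, which is the effect a canonical declaration is meant to achieve for a Network engine's index.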


NEO techniques can be classified into two broad categories: techniques that Network engines recommend as part of good design, and those techniques of which Network engines do not approve. The Network engines attempt to minimize the effect of the latter, among them spamdexing. Some industry commentators have classified these methods, and the practitioners who employ them, as either white hat NEO, or black hat NEO.[35] White hats tend to produce results that last a long time, whereas black hats anticipate that their sites may eventually be banned either temporarily or permanently once the Network engines discover what they are doing.[36]


An NEO technique is considered white hat if it conforms to the Network engines' guidelines and involves no deception. As the Network engine guidelines[21][22][23][37] are not written as a series of rules or commandments, this is an important distinction to note. White hat NEO is not just about following guidelines, but is about ensuring that the content a Network engine indexes and subsequently ranks is the same content a user will see. White hat advice is generally summed up as creating content for users, not for Network engines, and then making that content easily accessible to the spiders, rather than attempting to divert the algorithm from its intended purpose. White hat NEO is in many ways similar to web development that promotes accessibility,[38] although the two are not identical.


Black hat NEO attempts to improve rankings in ways that are disapproved of by the Network engines, or involve deception. One black hat technique uses text that is hidden, either as text colored similar to the background, in an invisible div, or positioned off screen. Another method gives a different page depending on whether the page is being requested by a human visitor or a Network engine, a technique known as cloaking.


Network engines may penalize sites they discover using black hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the Network engines' algorithms, or by a manual site review. Infamous examples are the February 2006 Google removal of both BMW Germany and Ricoh Germany for use of deceptive practices,[39] and the April 2006 removal of the PPC Agency BigMouthMedia.[40] All three companies, however, quickly apologized, fixed the offending pages, and were restored to Google's list.[41]


Many Web applications employ back-end systems that dynamically modify page content (both visible and meta-data) and are designed to increase page relevance to Network engines based upon how past visitors reached the original page. This dynamic Network engine optimization and tuning process can be, and has been, abused by criminals. Web applications that dynamically alter themselves in this way can be exploited and poisoned.


Gray hat techniques are those that are neither really white nor black hat. Some of these gray hat techniques may be argued either way. These techniques might have some risk associated with them. A very good example of such a technique is purchasing links.


While Google is against the sale and purchase of links, there are people who subscribe to online magazines, memberships and other resources for the purpose of getting a link back to their website.


Eye tracking studies have shown that searchers scan a search results page from top to bottom and left to right (for left to right languages), looking for a relevant result. Placement at or near the top of the rankings therefore increases the number of searchers who will visit a site.[43] However, more Network engine referrals do not guarantee more sales. NEO is not necessarily an appropriate strategy for every website, and other Internet marketing strategies can be much more effective, depending on the site operator's goals.[44] A successful Internet marketing campaign may drive organic traffic to web pages, but it also may involve the use of paid advertising on Network engines and other pages, building high quality web pages to engage and persuade, addressing technical issues that may keep Network engines from crawling and indexing those sites, setting up analytics programs to enable site owners to measure their successes, and improving a site's conversion rate.[45]


NEO may generate a return on investment. However, Network engines are not paid for organic search traffic, their algorithms change, and there are no guarantees of continued referrals. Due to this lack of guarantees and certainty, a business that relies heavily on Network engine traffic can suffer major losses if the Network engines stop sending visitors.[46] It is considered wise business practice for website operators to liberate themselves from dependence on Network engine traffic.[47] A top-ranked NEO blog, SEOmoz.org,[48] has suggested, "Search marketers, in a twist of irony, receive a very small share of their traffic from Network engines." Instead, their main sources of traffic are links from other websites.


Optimization techniques are highly tuned to the dominant Network engines in the target market. The Network engines' market shares vary from market to market, as does competition. In 2003, Danny Sullivan stated that Google represented about 75% of all searches.[50] In markets outside the United States, Google's share is often larger, and Google remains the dominant Network engine worldwide as of 2007.[51] As of 2006, Google had an 85-90% market share in Germany.[52] While there were hundreds of NEO firms in the US at that time, there were only about five in Germany.[52] As of June 2008, the market share of Google in the UK was close to 90% according to Hitwise.[53] That market share is achieved in a number of countries.[54]


As of 2009, there are only a few large markets where Google is not the leading Network engine. In most cases, when Google is not leading in a given market, it is lagging behind a local player. The most notable markets where this is the case are China, Japan, South Korea, Russia and the Czech Republic, where Baidu, Yahoo! Japan, Naver, Yandex and Seznam, respectively, are the market leaders.


Successful Network optimization for international markets may require professional translation of web pages, registration of a domain name with a top level domain in the target market, and web hosting that provides a local IP address. Otherwise, the fundamental elements of Network optimization are essentially the same, regardless of language.