Children's Internet Protection Act (CIPA) Ruling by United States District Court For The Eastern District Of Pennsylvania
content. If a Web page or site is not linked by others, then spidering will not discover that page or site. Furthermore, many larger Web sites contain instructions, through software, that prevent spiders from investigating that site, and therefore the contents of such sites also cannot be indexed using spidering technology.

Because of the vast size and decentralized structure of the Web, no search engine or directory indexes all of the content on the publicly indexable Web. We credit current estimates that no more than 50% of the content currently on the publicly indexable Web has been indexed by all search engines and directories combined. No currently available method or combination of methods for collecting URLs can collect the addresses of all URLs on the Web.

The portion of the Web that is not theoretically indexable through the use of "spidering" technology, because other Web pages do not link to it, is called the "Deep Web." Such sites or pages can still be made publicly accessible without being made publicly indexable by, for example, using individual or mass emailings (also known as "spam") to distribute the URL to potential readers or customers, or by using types of Web links that cannot be found by spiders but can be seen and used by readers. "Spamming" is a common method of distributing to potential customers links to sexually explicit content that is not indexable.

Because the Web is decentralized, it is impossible to say exactly how large it is. A 2000 study estimated a total of 7.1 million unique Web sites, which at the Web's historical rate of growth, would have increased to 11 million unique sites as of