Channel: SharePoint 2010 - Setup, Upgrade, Administration and Operations forum

Access denied when crawling public internet site


I have a dev and a QA SharePoint 2010 server farm environment. In my dev environment I have a search content source set up to crawl a series of public-facing (non-SharePoint) web sites. The crawling works fine, with no issues.

I recreated the exact same content source in my QA environment, and when the crawl runs I get "Access Denied" errors for each of the web sites listed in the content source. I verified that I can access the sites using a browser on the QA crawler server.

To debug the issue I ran Fiddler on both the dev and QA environments. With a proxy configuration change I was able to watch the search crawler on the dev farm successfully crawl the public-facing web sites. With the same configuration on QA, I saw that each web site returned a 401 access denied error.

I compared the HTTP headers of the requests on dev and QA to see if there was any difference. For some reason, on the QA server the GET request sent by the crawler includes an Authorization header. The dev request does not.
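For anyone wanting to repeat the comparison, here is a minimal sketch of diffing two captured requests, assuming the raw request headers from each Fiddler session have been copied out as plain text. The function names and both sample captures are hypothetical placeholders, not the actual traffic from either farm.

```python
def parse_request_headers(raw: str) -> dict[str, str]:
    """Parse a raw HTTP request head into {lowercased header name: value}."""
    lines = raw.strip().splitlines()
    headers = {}
    for line in lines[1:]:  # skip the request line ("GET / HTTP/1.1")
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(dev_raw: str, qa_raw: str) -> dict[str, list[str]]:
    """Report header names that differ between the two captures."""
    dev = parse_request_headers(dev_raw)
    qa = parse_request_headers(qa_raw)
    return {
        "only_in_dev": sorted(dev.keys() - qa.keys()),
        "only_in_qa": sorted(qa.keys() - dev.keys()),
        "changed": sorted(k for k in dev.keys() & qa.keys() if dev[k] != qa[k]),
    }


# Placeholder captures for illustration only (not real crawler traffic).
DEV_SAMPLE = """GET / HTTP/1.1
Host: www.example.com
User-Agent: placeholder-crawler
"""

QA_SAMPLE = """GET / HTTP/1.1
Host: www.example.com
User-Agent: placeholder-crawler
Authorization: NTLM PLACEHOLDER
"""

result = diff_headers(DEV_SAMPLE, QA_SAMPLE)
print(result)
```

Header names are lowercased before comparison because HTTP header names are case-insensitive, so a difference in capitalization alone is not reported.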

I have checked the content sources and crawl rules on both dev and QA; the settings are exactly the same on both. Any ideas on why the crawler would send an invalid Authorization header to a public-facing web site even when the site doesn't request any authentication information? I thought an Authorization header was only sent after the web server demanded it and had returned the schemes it supports (NTLM, Basic, etc.) to the requester.
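The challenge-response sequence I am describing can be checked by hand. Below is a rough sketch, assuming Python is available on the crawler server: it sends a GET with no Authorization header and reports the status plus any schemes the server offers in WWW-Authenticate. The helper names are hypothetical, and the parsing assumes one challenge per WWW-Authenticate header line.

```python
import http.client


def offered_schemes(status: int, www_authenticate: list[str]) -> list[str]:
    """Return the auth schemes a server offered, or [] if it sent no challenge.

    Assumes each WWW-Authenticate header line carries a single challenge.
    """
    if status != 401:
        return []
    return [value.split(None, 1)[0] for value in www_authenticate if value.strip()]


def probe(host: str, path: str = "/", use_https: bool = False) -> tuple[int, list[str]]:
    """Send an anonymous GET (no Authorization header) and report the result."""
    conn_cls = http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    conn = conn_cls(host, timeout=10)
    try:
        conn.request("GET", path)  # no credentials are attached here
        resp = conn.getresponse()
        challenges = resp.headers.get_all("WWW-Authenticate") or []
        return resp.status, offered_schemes(resp.status, challenges)
    finally:
        conn.close()
```

Calling `probe("www.example.com")` (substituting one of the crawled hosts) should show whether the site ever issues a challenge to an anonymous request. If it returns a non-401 status with an empty scheme list, the site is not asking for credentials, which would point at the QA crawler attaching the header on its own.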

