bee4/robots.txt
Latest stable version: v2.0.3
Composer install command:
composer require bee4/robots.txt
Package description
Robots.txt parser and matcher
README documentation
README
This library parses a robots.txt file and then checks whether a URL is allowed according to the rules it defines. It follows the rules defined in the RFC draft available here: http://www.robotstxt.org/norobots-rfc.txt
Installing
This project can be installed using Composer. Add the following to your composer.json:
{
    "require": {
        "bee4/robots.txt": "~2.0"
    }
}
or run this command:
composer require bee4/robots.txt:~2.0
Usage
<?php

// Content is assumed to live in the same namespace as ContentFactory;
// without this import, `new Content(...)` below would not resolve.
use Bee4\RobotsTxt\Content;
use Bee4\RobotsTxt\ContentFactory;
use Bee4\RobotsTxt\Parser;

// Extract content from a URL
$content = ContentFactory::build("https://httpbin.org/robots.txt");

// or build it directly from robots.txt content
$content = new Content("
User-agent: *
Allow: /

User-agent: google-bot
Disallow: /forbidden-directory
");

// Then you must parse the content
$rules = Parser::parse($content);

// or with a reusable Parser
$parser = new Parser();
$rules = $parser->analyze($content);

// Content can also be parsed directly as a string
$rules = Parser::parse('User-Agent: Bing
Disallow: /downloads');

// You can use the match method to check if a URL is allowed for a given user-agent...
$rules->match('Google-Bot v01', '/an-awesome-url');      // true
$rules->match('google-bot v01', '/forbidden-directory'); // false

// ...or get the applicable rule for a user-agent and match against it
$rule = $rules->get('*');
$result = $rule->match('/');                    // true
$result = $rule->match('/forbidden-directory'); // true: the "*" record only contains "Allow: /"
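The results in the comments above follow from the matching rules in the RFC draft: a robot obeys the first record whose user-agent token matches its name (case-insensitively), falling back to the "*" record, and within that record the first Allow or Disallow path that is a prefix of the URL decides, with "allowed" as the default. The following is a minimal, self-contained sketch of that logic only. It does not use the library and makes no claim about how bee4/robots.txt is implemented; the function name sketchIsAllowed and the array layout of $records are hypothetical, chosen for illustration.

```php
<?php

// Hypothetical helper, NOT part of bee4/robots.txt. Each record is assumed to be
// ['agent' => string, 'rules' => [[bool $allow, string $pathPrefix], ...]].
function sketchIsAllowed(array $records, string $userAgent, string $path): bool
{
    $selected = null;
    foreach ($records as $record) {
        if ($record['agent'] === '*') {
            // Remember the first wildcard record as the fallback.
            $selected = $selected ?? $record;
            continue;
        }
        // Case-insensitive substring comparison on the agent token;
        // the first specific record that matches wins.
        if (stripos($userAgent, $record['agent']) !== false) {
            $selected = $record;
            break;
        }
    }

    if ($selected === null) {
        return true; // no applicable record: everything is allowed
    }

    foreach ($selected['rules'] as [$allow, $prefix]) {
        // Rules are checked in file order; the first prefix match decides.
        // An empty path is skipped, so it never blocks anything.
        if ($prefix !== '' && strncmp($path, $prefix, strlen($prefix)) === 0) {
            return $allow;
        }
    }

    return true; // default when no rule matches
}

// Same records as the robots.txt content used in the usage example.
$records = [
    ['agent' => '*', 'rules' => [[true, '/']]],
    ['agent' => 'google-bot', 'rules' => [[false, '/forbidden-directory']]],
];

var_dump(sketchIsAllowed($records, 'Google-Bot v01', '/an-awesome-url'));      // bool(true)
var_dump(sketchIsAllowed($records, 'google-bot v01', '/forbidden-directory')); // bool(false)
var_dump(sketchIsAllowed($records, 'SomeOtherBot', '/forbidden-directory'));   // bool(true)
```

The third call shows why the "*" rule in the usage example allows /forbidden-directory: the Disallow line belongs to the google-bot record only, so a robot that falls back to the wildcard record never sees it.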
Statistics
- Total downloads: 9.33k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 12
- Views: 1
- Dependent projects: 1
- Suggesters: 0
Other information
- License: Apache-2.0
- Last updated: 2015-03-13