|
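[md]A manga I was still reading last night
had been taken down by the time I got up this morning.
What a shame.
So I wrote a scraper in Python:
any manga that is still visible through the API
can be downloaded.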
```py
import os
import re

import requests

BASEDIR = os.getcwd()
comicId = 7474  # set this to the comic's ID
headers = {'Referer': 'http://images.dmzj.com/'}  # sent with every image request

# Fetch the comic's metadata, which includes the chapter list
comicInfo = requests.get("http://v2.api.dmzj.com/comic/%d.json" % comicId).json()
chapters = comicInfo['chapters'][0]['data']

for chapter in chapters:
    cId = chapter['chapter_id']
    cTitle = chapter['chapter_title']
    cOrder = chapter['chapter_order']
    # One folder per chapter, named after the chapter title
    cdir = os.path.join(BASEDIR, cTitle)
    if not os.path.exists(cdir):
        os.makedirs(cdir)
        print("%s create folder %s" % (cTitle, cdir))

    # Fetch the list of page image URLs for this chapter
    chapterInfo = requests.get("http://v2.api.dmzj.com/chapter/%d/%d.json" % (comicId, cId)).json()
    for page in chapterInfo['page_url']:
        # The file name is the page number taken from the URL
        match = re.search(r'(\d+)\.jpg', page)
        if match is None:
            continue  # skip URLs that do not contain <number>.jpg
        print(match.group(1))
        res = requests.get(page, headers=headers)
        with open(os.path.join(cdir, '%s.jpg' % match.group(1)), 'wb') as pic:
            # The default chunk size is 1 byte, so ask for larger chunks
            for chunk in res.iter_content(chunk_size=8192):
                pic.write(chunk)

# res = requests.get("http://imgsmall.dmzj.com/w/7474/27922/0.jpg", headers=headers)
# pic = open('t/pic.jpg', 'wb')
# for chunk in res.iter_content():
#     pic.write(chunk)
# pic.close()
```
![un.png](data/attachment/forum/202003/09/144245shak72n2w7wkzv2v.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/600 "un.png")
No multithreading, so it is only just usable. Scraping 86 chapters takes about three minutes.
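
If the three minutes bother you, one possible way to add concurrency is a thread pool per chapter. This is only a rough sketch, not part of the script above: `fetch_page`, `save_page`, `download_chapter`, and the `max_workers=8` default are names and values I made up for illustration, and I have not checked how many parallel requests the image server tolerates.

```python
import os
import re
from concurrent.futures import ThreadPoolExecutor

HEADERS = {'Referer': 'http://images.dmzj.com/'}  # same Referer as the script above


def fetch_page(url):
    """Download one image and return its bytes (hypothetical helper)."""
    import requests  # imported here so the other helpers work without it
    res = requests.get(url, headers=HEADERS, timeout=30)
    res.raise_for_status()
    return res.content


def save_page(url, cdir, fetch=fetch_page):
    """Fetch one page and write it to cdir as <number>.jpg; return the path."""
    match = re.search(r'(\d+)\.jpg', url)
    if match is None:
        return None  # URL does not contain <number>.jpg, nothing saved
    path = os.path.join(cdir, '%s.jpg' % match.group(1))
    with open(path, 'wb') as pic:
        pic.write(fetch(url))
    return path


def download_chapter(page_urls, cdir, fetch=fetch_page, max_workers=8):
    """Download all pages of one chapter concurrently (assumed pool size: 8)."""
    os.makedirs(cdir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order and re-raises worker exceptions
        return list(pool.map(lambda u: save_page(u, cdir, fetch), page_urls))
```

In the script above, the inner `for page in chapterInfo['page_url']:` loop could then be swapped for a single call such as `download_chapter(chapterInfo['page_url'], cdir)`. Threads fit here because the work is almost all waiting on the network.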
[/md] |
-
|