scpscraper

AI source code

v1.0.1

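SCP Scraper

A small Python library designed for scraping data from the SCP Wiki. It was made with AI training (namely NLP models) and dataset collection (for things like categorizing SCPs for external projects) in mind, and it has arguments that make it easy to use in those applications.

Below you will find installation instructions, examples of how to use this library, and ways you can put it to use. I hope you find it as useful as I have!

Sample Code

Installation

scpscraper can be installed via pip install. Here is the command I recommend, so that you consistently have the latest version.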

    pip3 install --upgrade scpscraper
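
To confirm which version pip actually installed, the standard library's `importlib.metadata` can report it. This is only a minimal sketch; `installed_version` is a helper name made up for this example and is not part of scpscraper.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if scpscraper is installed, otherwise None.
print(installed_version("scpscraper"))
```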

The Basics

Importing the Library

    # Before we begin, we obviously have to import scpscraper.
    import scpscraper

Grabbing an SCP's Name

    # Let's use 3001 (Red Reality) as an example.
    name = scpscraper.get_scp_name(3001)

    print(name)  # Outputs "Red Reality"

Grabbing as Many Details as Possible About an SCP

    # Again using 3001 as an example
    info = scpscraper.get_scp(3001)

    print(info)  # Outputs a dictionary with the
    # name, object id, rating, page content by section, etc.
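
Since the exact keys of that dictionary are not listed here, one way to get a quick overview is to print a shortened preview of every entry. The sketch below assumes nothing about the key names; `summarize` is a hypothetical helper, and `sample` is stand-in data, not real output from `get_scp()`.

```python
def summarize(info, width=40):
    """Return one 'key: preview' line per entry, truncating long values."""
    lines = []
    for key, value in info.items():
        preview = repr(value)
        if len(preview) > width:
            preview = preview[: width - 3] + "..."
        lines.append(f"{key}: {preview}")
    return lines


# Stand-in data; the keys and values returned by get_scp() may differ.
sample = {
    "name": "Red Reality",
    "content": {"Description": "placeholder text " * 10},
}

for line in summarize(sample):
    print(line)
```

With the real dictionary you would call `summarize(info)` instead of `summarize(sample)`.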

The Fun Stuff

Grabbing an SCP's `page-content` div HTML

For reference, the page-content div contains what the user actually wrote, without all the extra Wikidot markup around it.

    # Once again, 3001 is the example
    scp = scpscraper.get_single_scp(3001)

    # Grab the page-content div specifically
    content = scp.find_all('div', id='page-content')

    print(content)  # Outputs '<div id="page-content"> ... </div>'
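
To make the idea of the `page-content` div concrete without touching the network, here is a standard-library sketch that pulls the text out of such a div in a made-up HTML string. It does not use scpscraper and does not show how the library works internally; the class name, the function name, and the sample markup are all assumptions for illustration.

```python
from html.parser import HTMLParser


class PageContentText(HTMLParser):
    """Collect the text found inside <div id="page-content">."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # <div> nesting depth inside the target; 0 means outside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # a nested div inside page-content
        elif dict(attrs).get("id") == "page-content":
            self.depth = 1  # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def page_content_text(html):
    """Return the text inside the page-content div, joined by spaces."""
    parser = PageContentText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up markup standing in for a wiki page.
html = (
    '<div id="header">Site navigation</div>'
    '<div id="page-content"><p>Written by a user</p>'
    '<div class="box">Nested block</div></div>'
    '<div id="footer">Footer</div>'
)

print(page_content_text(html))  # Written by a user Nested block
```

Only the text inside the target div survives; the header and footer text is dropped.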

Scraping HTML or Information from Multiple SCPs

    # Grab info on SCPs 000-099
    scpscraper.scrape_scps(0, 100)

    # Same as above, but only grabbing Keter-class SCPs
    scpscraper.scrape_scps(0, 100, tags=['keter'])

    # Grab 000-099 in a format that can be used to train AI
    scpscraper.scrape_scps(0, 100, ai_dataset=True)

    # Scrape the page-content div's HTML from SCP-000 to SCP-099

    # Only including this as an example, but scrape_scps_html() has
    # all the same options as scrape_scps().
    scpscraper.scrape_scps_html(0, 100)
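
The comments above imply that the second argument is exclusive: `(0, 100)` covers SCP-000 through SCP-099. The sketch below only illustrates that range convention and the zero-padded numbering; `scp_designations` is a name invented for this example and is not part of scpscraper.

```python
def scp_designations(start, stop):
    """Return zero-padded designations for every n with start <= n < stop."""
    return [f"SCP-{n:03d}" for n in range(start, stop)]


ids = scp_designations(0, 100)
print(len(ids), ids[0], ids[-1])  # 100 SCP-000 SCP-099
```

Under this convention, scraping the next hundred entries would mean passing `(100, 200)`.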

Google Colaboratory Exclusive Usage

Because of the google.colab module included in Google Colaboratory, a few extra things are possible there that are not possible elsewhere.
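
If the same script has to run both inside and outside Colaboratory, it can help to check for the `google.colab` module before calling anything that depends on it. The following is a minimal sketch using only the standard library; `in_colab` is a hypothetical helper, not a function provided by scpscraper.

```python
import importlib.util


def in_colab():
    """Return True when the google.colab module can be found."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except ModuleNotFoundError:
        # The parent package "google" is not installed at all.
        return False


if in_colab():
    print("Colab detected: the Drive features should be usable.")
else:
    print("Not running in Colab: skip the Drive features.")
```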

Mount Your Google Drive to the Colaboratory VM

    # Mounts it to the directory /content/drive/
    scpscraper.gdrive.mount()

Scrape SCP Info/HTML, Then Copy It to Your Google Drive

    # Requires your Google Drive to be mounted at the directory /content/drive/
    scpscraper.scrape_scps(0, 100, copy_to_drive=True)

    scpscraper.scrape_scps_html(0, 100, copy_to_drive=True)

Copy Other Files to or from Your Google Drive

    # Requires your Google Drive to be mounted at the directory /content/drive/
    scpscraper.gdrive.copy_to_drive('example.txt')

    scpscraper.gdrive.copy_from_drive('example.txt')
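
Once mounted, the Drive appears inside the VM as a directory, so a copy in either direction is conceptually an ordinary file copy. The sketch below illustrates that idea with the standard library only. `copy_into` is a hypothetical helper and not how `scpscraper.gdrive` is implemented, and the folder layout under `/content/drive/` is something to check in your own session.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into(src, dest_dir):
    """Copy the file src into the existing directory dest_dir; return the new path."""
    dest_dir = Path(dest_dir)
    if not dest_dir.is_dir():
        raise FileNotFoundError(f"{dest_dir} does not exist (is the Drive mounted?)")
    return Path(shutil.copy2(src, dest_dir))


# Temporary directories stand in for the VM's working directory and the Drive.
with tempfile.TemporaryDirectory() as vm, tempfile.TemporaryDirectory() as drive:
    source = Path(vm) / "example.txt"
    source.write_text("scraped data")
    copied = copy_into(source, drive)
    print(copied.name, copied.read_text())  # example.txt scraped data
```

Failing early when the destination directory is missing gives a clearer error than a copy that silently creates a stray file.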

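Planned Updates

Potential future updates may make it easy to scrape data from any website, allowing for easy mass collection of data.

Link to GitHub Repo

Please consider checking it out! In the GitHub repo you can report issues, request features, and contribute to this project. That is the best way to reach me with issues or feedback about this project.

https://github.com/jaonhax/scpscraper/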

Additional Information

Version: v1.0.1
Type: AI source code
Updated: 2025-09-12
Size: 14.54 KB
Source: GitHub
