Posts tagged 'NLTK'

List: NLTK (1)

juooo1117

Text pre-processing (cranfieldDocs)

This post walks through text pre-processing on the cranfieldDocs files (.txt). The pre-processing methods used are as follows.

Remove markup
Convert to lowercase
Extract only the content inside specific tags
Remove punctuation and numbers
Tokenization

Practice

Import the required packages, load 'cranfieldDocs', and store the lines of each file together as one long string.

from bs4 import BeautifulSoup
import string
from nltk.stem import PorterStemmer
# read file
doc = ""
for line in open('/Users/juhyeon/pyth..
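The excerpt above is cut off before the actual processing code, so here is a minimal sketch of the listed steps for orientation. It is not the author's code: the post uses BeautifulSoup and NLTK, while this version swaps in the standard library (`re`, `string`) so it runs without extra packages. The `SAMPLE` document, the tag names `title` and `text`, and the helper names `extract_tag` and `preprocess` are all assumptions made for illustration.

```python
import re
import string

# Hypothetical sample in a Cranfield-style tagged layout (not taken from the post).
SAMPLE = """<DOC>
<DOCNO> 1 </DOCNO>
<TITLE> Experimental investigation of the aerodynamics of a wing </TITLE>
<TEXT> The lift-curve slope was measured at 2 speeds. </TEXT>
</DOC>"""


def extract_tag(doc, tag):
    """Return the inner text of every <tag>...</tag> pair, joined by spaces."""
    pattern = re.compile(
        r"<{0}>(.*?)</{0}>".format(re.escape(tag)),
        re.IGNORECASE | re.DOTALL,
    )
    return " ".join(m.strip() for m in pattern.findall(doc))


def preprocess(doc, tags=("title", "text")):
    # Convert to lowercase (tag matching below is case-insensitive anyway).
    doc = doc.lower()
    # Keep only the content of the chosen tags; this also drops the markup.
    kept = " ".join(extract_tag(doc, t) for t in tags)
    # Replace punctuation and digits with spaces, so "lift-curve" becomes two tokens.
    table = str.maketrans({c: " " for c in string.punctuation + string.digits})
    kept = kept.translate(table)
    # Whitespace tokenization (a simpler stand-in for an NLTK tokenizer).
    return kept.split()


tokens = preprocess(SAMPLE)
print(tokens)
```

Replacing punctuation with spaces rather than deleting it is a design choice: deletion would merge "lift-curve" into "liftcurve", which may or may not be what a given retrieval setup wants.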

Artificial Intelligence 2023. 10. 27. 15:19
