A Java Method for Removing Comments from HTML

Author: Eve Cole  Updated: 2025-03-14 00:16:02

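There are actually many ways to remove comments from HTML text.

Comments in HTML text have a few characteristics:

1. They come in pairs: every start tag has a matching end tag.

2. Comment tags do not nest: a comment start tag (<!--) is always followed by its own end tag (-->).

3. A single line may contain several pairs of comment tags.

4. A comment may also span several lines.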

Typical cases look roughly like this:

<html>
<!--This is a head-->
<!--This is
a div-->
<!--This is
a span--> <!--span in
a div--> <div> a div </div>
<!--This is a
span--> <div> a div </div> <!--span in a div-->
</html>

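The idea:

1. Read the text one line at a time.

2. If the line contains both <!-- and -->, and <!-- comes before -->, delete the comment between the two tags and keep the rest.

3. If the line contains both <!-- and -->, but --> comes before <!--, keep the content between the two tags and record that a <!-- tag has been encountered.

4. If the line contains only <!--, keep the content before the tag and record that a <!-- tag has been encountered.

5. If the line contains only -->, keep the content after the tag and record that a --> tag has been encountered.

6. Repeat steps 2, 3, 4 and 5 on what remains of the line.

7. Save the remaining content.

8. Read the next line.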
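As a compact illustration of the steps above, the following is a minimal sketch using only the JDK. The class name `CommentStepsSketch` and the method `strip` are hypothetical names chosen for this example. The sketch keeps a single flag that records whether a `<!--` has been seen without its `-->`, and walks each line from left to right.

```java
import java.util.List;

// Hypothetical sketch of steps 1 to 8; not a full HTML parser.
class CommentStepsSketch {
    private static final String START = "<!--";
    private static final String END = "-->";

    // True while a comment opened earlier has not been closed yet.
    private boolean inComment = false;

    /** Returns the part of one line that lies outside comments. */
    String strip(String line) {
        StringBuilder kept = new StringBuilder();
        int pos = 0;
        while (pos < line.length()) {
            if (inComment) {
                int end = line.indexOf(END, pos);
                if (end == -1) {
                    return kept.toString(); // rest of the line is comment text
                }
                inComment = false;
                pos = end + END.length();
            } else {
                int start = line.indexOf(START, pos);
                if (start == -1) {
                    kept.append(line, pos, line.length()); // no more comments
                    break;
                }
                kept.append(line, pos, start); // text before the start tag
                inComment = true;
                pos = start + START.length();
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        CommentStepsSketch sketch = new CommentStepsSketch();
        List<String> lines = List.of(
                "<html>",
                "<!--This is",
                "a span--> <!--span in",
                "a div--> <div> a div </div>",
                "</html>");
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(sketch.strip(line)); // lines are joined without separators
        }
        System.out.println(out);
    }
}
```

Because the flag lives in the object, one instance should be used per document and fed the lines in order.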

The code is as follows:

import java.io.BufferedReader;
import java.io.IOException;

public class HtmlCommentHandler {

    /**
     * Detector for comments in HTML content
     * @author Boyce
     * @version 2013-12-3
     */
    private static class HtmlCommentDetector {
        private static final String COMMENT_START = "<!--";
        private static final String COMMENT_END = "-->";

        // Whether the line holds a complete comment: a start tag "<!--" followed by an end tag "-->"
        private static boolean isCommentLine(String line) {
            if (!containsCommentStartTag(line))
                return false;
            int start = line.indexOf(COMMENT_START);
            return line.indexOf(COMMENT_END, start + COMMENT_START.length()) != -1;
        }

        // Whether the line contains a comment start tag
        private static boolean containsCommentStartTag(String line) {
            return line != null && line.indexOf(COMMENT_START) != -1;
        }

        // Whether the line contains a comment end tag
        private static boolean containsCommentEndTag(String line) {
            return line != null && line.indexOf(COMMENT_END) != -1;
        }

        /**
         * Deletes every complete comment in the line
         */
        private static String deleteCommentInLine(String line) {
            while (isCommentLine(line)) {
                int start = line.indexOf(COMMENT_START);
                int end = line.indexOf(COMMENT_END, start + COMMENT_START.length());
                // Keep the text on both sides of the comment
                line = line.substring(0, start) + line.substring(end + COMMENT_END.length());
            }
            return line;
        }

        // Gets the content before the comment start tag
        private static String getBeforeCommentContent(String line) {
            if (!containsCommentStartTag(line))
                return line;
            return line.substring(0, line.indexOf(COMMENT_START));
        }

        // Gets the content after the comment end tag
        private static String getAfterCommentContent(String line) {
            if (!containsCommentEndTag(line))
                return line;
            return line.substring(line.indexOf(COMMENT_END) + COMMENT_END.length());
        }
    }

    /**
     * Reads HTML content and removes the comments
     */
    public static String readHtmlContentWithoutComment(BufferedReader reader) throws IOException {
        StringBuilder builder = new StringBuilder();
        String line;
        // Whether we are inside a comment that was opened on an earlier line
        boolean inComment = false;
        while ((line = reader.readLine()) != null) {
            if (inComment) {
                // Comment tags do not nest, so only an end tag can close the open comment
                // --> content
                if (!HtmlCommentDetector.containsCommentEndTag(line))
                    continue; // the whole line is still comment text
                // Set inComment to false and take the content after the end tag
                inComment = false;
                line = HtmlCommentDetector.getAfterCommentContent(line);
            }
            // Delete the content between comment tags that appear in pairs
            // <!-- comment -->
            line = HtmlCommentDetector.deleteCommentInLine(line);
            // A start tag that is still present has no end tag on this line
            // content <!--
            if (HtmlCommentDetector.containsCommentStartTag(line)) {
                // Set inComment to true and take the content before the start tag
                inComment = true;
                line = HtmlCommentDetector.getBeforeCommentContent(line);
            }
            // Save the non-comment content of this line
            if (!line.isEmpty())
                builder.append(line);
        }
        return builder.toString();
    }
}

Of course, there are many other ways to do this, such as deleting comments with a regular expression, or using a stack to track the start and end tags.
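For comparison, the regular-expression route could look like the minimal sketch below. It assumes the whole document fits in memory as one string. `RegexCommentStripper` and `stripComments` are hypothetical names used only for this example. `Pattern.DOTALL` lets `.` match line breaks, and the reluctant `.*?` stops at the first `-->`, which matches the rule that comments do not nest.

```java
import java.util.regex.Pattern;

// Hypothetical regex-based alternative; works on the full text, not line by line.
class RegexCommentStripper {
    // DOTALL: '.' also matches line terminators, so multi-line comments are covered.
    // '.*?' is reluctant: the match ends at the first "-->" after "<!--".
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    static String stripComments(String html) {
        return COMMENT.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<html>\n<!--This is\na div-->\n"
                + "<div> a div </div> <!--span in a div-->\n</html>";
        System.out.println(stripComments(html));
    }
}
```

Unlike the line-based approach, this sketch leaves an unterminated `<!--` in place instead of dropping everything after it, and it preserves the line breaks outside comments.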

The code above has been tested and used in practice, and I hope it is useful to anyone who needs it.