�

(ٜgK:���dZdZdgZddlmZddlZddlZddlmZm	Z	m
Z
mZmZddl
mZmZddlmZmZmZmZmZd	ZGd
�dee�ZGd�de�Zy)
zCUse the HTMLParser library to parse HTML files that aren't too bad.�MIT�HTMLParserTreeBuilder�)�
HTMLParserN)�CData�Comment�Declaration�Doctype�ProcessingInstruction)�EntitySubstitution�
UnicodeDammit)�DetectsXMLParsedAsHTML�ParserRejectedMarkup�HTML�HTMLTreeBuilder�STRICTzhtml.parserc�d�eZdZdZdZdZd�Zd�Zd�Zdd�Z	dd�Z
d	�Zd
�Zd�Z
d�Zd
�Zd�Zd�Zy)�BeautifulSoupHTMLParserz�A subclass of the Python standard library's HTMLParser class, which
    listens for HTMLParser events and translates them into calls
    to Beautiful Soup's tree construction API.
    �ignore�replacec��|jd|j�|_tj|g|��i|��g|_|j
�y)aConstructor.

        :param on_duplicate_attribute: A strategy for what to do if a
            tag includes the same attribute more than once. Accepted
            values are: REPLACE (replace earlier values with later
            ones, the default), IGNORE (keep the earliest value
            encountered), or a callable. A callable must take three
            arguments: the dictionary of attributes already processed,
            the name of the duplicate attribute, and the most recent value
            encountered.           
        �on_duplicate_attributeN)�pop�REPLACErr�__init__�already_closed_empty_element�_initialize_xml_detector)�self�args�kwargss   �B/usr/local/lib/python3.12/site-packages/bs4/builder/_htmlparser.pyrz BeautifulSoupHTMLParser.__init__.sN��'-�j�j�$�d�l�l�'
��#�	���D�2�4�2�6�2�-/��)��%�%�'�c��t|��)N)r)r�messages  r �errorzBeautifulSoupHTMLParser.errorJs��#�7�+�+r!c�N�|j||d��}|j|�y)z�Handle an incoming empty-element tag.

        This is only called when the markup looks like <tag/>.

        :param name: Name of the tag.
        :param attrs: Dictionary of the tag's attributes.
        F)�handle_empty_elementN)�handle_starttag�
handle_endtag)r�name�attrs�tags    r �handle_startendtagz*BeautifulSoupHTMLParser.handle_startendtagZs)���"�"�4��U�"�K�����4� r!c���i}|D]Q\}}|�d}||vr=|j}||jk(rn&|d|jfvr|||<n||||�n|||<d}�S|j�\}	}
|jj|dd||	|
��}|r<|jr0|r.|j|d��|jj|�|j�|j|�yy)a3Handle an opening tag, e.g. '<tag>'

        :param name: Name of the tag.
        :param attrs: Dictionary of the tag's attributes.
        :param handle_empty_element: True if this tag is known to be
            an empty-element tag (i.e. there is not expected to be any
            closing tag).
        N�z"")�
sourceline�	sourceposF)�check_already_closed)r�IGNOREr�getpos�soupr'�is_empty_elementr(r�append�	_root_tag�_root_tag_encountered)rr)r*r&�	attr_dict�key�value�on_dupe�	attrvaluer/r0r+s            r r'z'BeautifulSoupHTMLParser.handle_starttagis���	��J�C���}����i���5�5���d�k�k�)����t�|�|� 4�4�%*�I�c�N��I�s�E�2�!&�	�#���I�% �(!%���
��
�I��i�i�'�'��$��i�J��(�
���3�'�'�,@�
���t�%��@�
�-�-�4�4�T�:��>�>�!��&�&�t�,�"r!c��|r*||jvr|jj|�y|jj|�y)z�Handle a closing tag, e.g. '</tag>'
        
        :param name: A tag name.
        :param check_already_closed: True if this tag is expected to
           be the closing portion of an empty-element tag,
           e.g. '<tag></tag>'.
        N)r�remover4r()rr)r1s   r r(z%BeautifulSoupHTMLParser.handle_endtag�s<�� �D�D�,M�,M�$M�

�-�-�4�4�T�:��I�I�#�#�D�)r!c�:�|jj|�y)z4Handle some textual data that shows up between tags.N)r4�handle_data�r�datas  r rAz#BeautifulSoupHTMLParser.handle_data�s���	�	���d�#r!c��|jd�rt|jd�d�}n8|jd�rt|jd�d�}nt|�}d}|dkr<|jjdfD]!}|s�	t|g�j
|�}�#|s	t|�}|xsd}|j|�y#t$r
}Yd}~�Xd}~wwxYw#ttf$r
}Yd}~�Bd}~wwxYw)z�Handle a numeric character reference by converting it to the
        corresponding Unicode character and treating it as textual
        data.

        :param name: Character number, possibly in hexadecimal.
        �x��XN�zwindows-1252u�)�
startswith�int�lstripr4�original_encoding�	bytearray�decode�UnicodeDecodeError�chr�
ValueError�
OverflowErrorrA)rr)�	real_namerC�encoding�es      r �handle_charrefz&BeautifulSoupHTMLParser.handle_charref�s����?�?�3���D�K�K��,�b�1�I�
�_�_�S�
!��D�K�K��,�b�1�I��D�	�I����s�?�"�Y�Y�8�8�.�I�����$�i�[�1�8�8��B�D�	J��
��9�~���2�2��������*������
�
�.�
���
�s$�C�,C%�	C"�C"�%C>�9C>c�x�tjj|�}|�|}nd|z}|j|�y)z�Handle a named entity reference by converting it to the
        corresponding Unicode character(s) and treating it as textual
        data.

        :param name: Name of the entity reference.
        Nz&%s)r�HTML_ENTITY_TO_CHARACTER�getrA)rr)�	characterrCs    r �handle_entityrefz(BeautifulSoupHTMLParser.handle_entityref�s>��'�?�?�C�C�D�I�	�� ��D��4�<�D�����r!c��|jj�|jj|�|jjt�y)zOHandle an HTML comment.

        :param data: The text of the comment.
        N)r4�endDatarArrBs  r �handle_commentz&BeautifulSoupHTMLParser.handle_comment�s8��
	
�	�	�����	�	���d�#��	�	���'�"r!c���|jj�|td�d}|jj|�|jjt�y)zYHandle a DOCTYPE declaration.

        :param data: The text of the declaration.
        zDOCTYPE N)r4r]�lenrAr	rBs  r �handle_declz#BeautifulSoupHTMLParser.handle_decl�sI��
	
�	�	�����C�
�O�$�%���	�	���d�#��	�	���'�"r!c��|j�jd�rt}|td�d}nt}|j
j
�|j
j|�|j
j
|�y)z{Handle a declaration of unknown type -- probably a CDATA block.

        :param data: The text of the declaration.
        zCDATA[N)�upperrIrr`rr4r]rA)rrC�clss   r �unknown_declz$BeautifulSoupHTMLParser.unknown_declsf��
�:�:�<�"�"�8�,��C���H�
��'�D��C��	�	�����	�	���d�#��	�	���#�r!c���|jj�|jj|�|j|�|jjt�y)z\Handle a processing instruction.

        :param data: The text of the instruction.
        N)r4r]rA�_document_might_be_xmlr
rBs  r �	handle_piz!BeautifulSoupHTMLParser.handle_pisG��
	
�	�	�����	�	���d�#��#�#�D�)��	�	���/�0r!N)T)�__name__�
__module__�__qualname__�__doc__r2rrr$r,r'r(rArVr[r^rarerh�r!r rr$sQ����F��G�(�8,� 
!�5-�n*�$$�&�P�&#�#��1r!rc�P��eZdZdZdZdZeZeee	gZ
dZd�fd�	Z		dd�Z
d�Z�xZS)	rzpA Beautiful soup `TreeBuilder` that uses the `HTMLParser` parser,
    found in the Python standard library.
    FTc����t�}dD]}||vs�|j|�}|||<�tt|�di|��|xsg}|xsi}|j|�d|d<||f|_y)a�Constructor.

        :param parser_args: Positional arguments to pass into 
            the BeautifulSoupHTMLParser constructor, once it's
            invoked.
        :param parser_kwargs: Keyword arguments to pass into 
            the BeautifulSoupHTMLParser constructor, once it's
            invoked.
        :param kwargs: Keyword arguments for the superclass constructor.
        )rF�convert_charrefsNrm)�dictr�superrr�update�parser_args)rrt�
parser_kwargsr�extra_parser_kwargs�argr;�	__class__s       �r rzHTMLParserTreeBuilder.__init__*s����#�f��.�C��f�}��
�
�3���+0�#�C�(�/�	�#�T�3�=�f�=�!�'�R��%�+��
����0�1�,1�
�(�)�'��7��r!c#��K�t|t�r	|dddf��y|g}|g}||g}t|||d|��}|j|j|j
|jf��y�w)a�Run any preliminary steps necessary to make incoming markup
        acceptable to the parser.

        :param markup: Some markup -- probably a bytestring.
        :param user_specified_encoding: The user asked to try this encoding.
        :param document_declared_encoding: The markup itself claims to be
            in this encoding.
        :param exclude_encodings: The user asked _not_ to try any of
            these encodings.

        :yield: A series of 4-tuples:
         (markup, encoding, declared encoding,
          has undergone character replacement)

         Each 4-tuple represents a strategy for converting the
         document to Unicode and parsing it. Each strategy will be tried 
         in turn.
        NFT)�known_definite_encodings�user_encodings�is_html�exclude_encodings)�
isinstance�strr�markuprL�declared_html_encoding�contains_replacement_characters)	rr��user_specified_encoding�document_declared_encodingr}rzr{�
try_encodings�dammits	         r �prepare_markupz$HTMLParserTreeBuilder.prepare_markupCs�����*�f�c�"��4��u�-�-��%<�#<� �5�5��0�2L�M�
���%=�)��/�
���}�}�f�6�6��,�,��5�5�7�	7�s�A%A'c���|j\}}t|i|��}|j|_	|j|�|j	�g|_y#t
$r}t
|��d}~wwxYw)z{Run some incoming markup through some parsing process,
        populating the `BeautifulSoup` object in self.soup.
        N)rtrr4�feed�close�AssertionErrorrr)rr�rr�parserrUs      r r�zHTMLParserTreeBuilder.feedtsp���'�'���f�(�$�9�&�9���i�i���	*��K�K����L�L�N�/1��+���	*�'�q�)�)��		*�s�!A�	A/�A*�*A/)NN)NNN)rirjrkrl�is_xml�	picklable�
HTMLPARSER�NAMErr�features�TRACKS_LINE_NUMBERSrr�r��
__classcell__)rxs@r rrsF�����F��I��D��d�F�#�H���8�2>B�JN�/7�b1r!)rl�__license__�__all__�html.parserr�sys�warnings�bs4.elementrrrr	r
�
bs4.dammitrr�bs4.builderr
rrrrr�rrrmr!r �<module>r�sf��I������#�
����9����
�v1�j�*@�v1�rf1�O�f1r!