Current File : //usr/local/lib/python3.12/site-packages/pandas/io/sas/__pycache__/sas_xport.cpython-312.pyc

�

Mٜg;�	�H�dZddlmZddlmZddlmZddlZddlmZddl	Z	ddl
Zddlm
Z
ddlmZddlZdd	lmZdd
lmZerddlmZmZmZmZdZd
ZdZdZgd�ZdZ dZ!dZ"dZ#de �de"�de!�de#�d�	Z$de �de!�d�Z%dZ&d!d�Z'd"d�Z(d�Z)d�Z*Gd�d eejV�Z,y)#a-
Read a SAS XPort format file into a Pandas DataFrame.

Based on code from Jack Cushman (github.com/jcushman/xport).

The file format is defined here:

https://support.sas.com/content/dam/SAS/support/en/technical-papers/record-layout-of-a-sas-version-5-or-6-data-set-in-sas-transport-xport-format.pdf
�)�annotations)�abc)�datetimeN)�
TYPE_CHECKING)�Appender)�find_stack_level)�
get_handle)�
ReaderBase)�CompressionOptions�DatetimeNaTType�FilePath�
ReadBufferzPHEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  zKHEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!000000000000000001600000000zPHEADER RECORD*******DSCRPTR HEADER RECORD!!!!!!!000000000000000000000000000000  zPHEADER RECORD*******OBS     HEADER RECORD!!!!!!!000000000000000000000000000000  )�ntype�nhfun�field_length�nvar0�name�label�nform�nfl�num_decimals�nfj�nfill�niform�nifl�nifd�npos�_z�Parameters
----------
filepath_or_buffer : str or file-like object
    Path to SAS file or object implementing binary read method.z�index : identifier of index column
    Identifier of column that should be used as index of the DataFrame.
encoding : str
    Encoding for text data.
chunksize : int
    Read file `chunksize` lines at a time, returns iterator.zBformat : str
    File format, only `xport` is currently supported.z\iterator : bool, default False
    Return XportReader object for reading file incrementally.z#Read a SAS file into a DataFrame.

�
a

Returns
-------
DataFrame or XportReader

Examples
--------
Read a SAS Xport file:

>>> df = pd.read_sas('filename.XPT')

Read a Xport file in 10,000 line chunks:

>>> itr = pd.read_sas('filename.XPT', chunksize=10000)
>>> for chunk in itr:
>>>     do_something(chunk)

z$Class for reading SAS Xport files.

z�

Attributes
----------
member_info : list
    Contains information about the file
fields : list
    Contains information about the variables in the file
z�Read observations from SAS Xport file, returning as data frame.

Parameters
----------
nrows : int
    Number of rows to read from data file; if None, read whole
    file.

Returns
-------
A DataFrame.
c�n�	tj|d�S#t$rtjcYSwxYw)z1Given a date in xport format, return Python date.z%d%b%y:%H:%M:%S)r�strptime�
ValueError�pd�NaT)�datestrs �B/usr/local/lib/python3.12/site-packages/pandas/io/sas/sas_xport.py�_parse_dater'�s3���� � ��*;�<�<�����v�v�
��s��4�4c�d�i}d}|D]#\}}||||zj�||<||z
}�%|d=|S)a
    Parameters
    ----------
    s: str
        Fixed-length string to split
    parts: list of (name, length) pairs
        Used to break up string, name '_' will be filtered from output.

    Returns
    -------
    Dict of name:contents of string at given location.
    rr)�strip)�s�parts�out�startr�lengths      r&�_split_liner/�sP��
�C�
�E����f��e�e�f�n�-�3�3�5��D�	�
�����	�C���J�c���|dk7ritjt|�tjd��}tjd|�dd|z
���}|j	|��}||d<|S|S)N��S8�Sz,S��dtype�f0)�np�zeros�lenr6�view)�vec�nbytes�vec1r6�vec2s     r&�_handle_truncated_float_vecr@�si����{��x�x��C��"�(�(�4�.�1�����1�V�H�B�q�6�z�l�3�4���y�y�u�y�%����T�
����Jr0c��tjd�}|j|��}|d}|d}|dz}tjt	|�tj
��}d|tj|dz�<d|tj|d	z�<d
|tj|dz�<||z}||z	|dzd
d
|z
zzz}|dz}||dz	dzdz
dz|zdzdz|dzzz}tjt	|�fd��}||d<||d<|jd��}|jd�}|S)zf
    Parse a vector of float values representing IBM 8 byte floats into
    native 8 byte floats.
    z>u4,>u4r5r7�f1i����i �i@�i���l�����Ai��lz>f8�f8)	r8r6r;r9r:�uint8�where�empty�astype)	r<r6r>�xport1�xport2�ieee1�shift�ieee2�ieees	         r&�_parse_float_vecrW�sd��

�H�H�Y��E��8�8�%�8� �D�
�$�Z�F�
�$�Z�F�
�Z��E�
�H�H�S��X�R�X�X�.�E�+,�E�"�(�(�6�J�&�
'�(�+,�E�"�(�(�6�J�&�
'�(�+,�E�"�(�(�6�J�&�
'�(�
�e�O�E�
�u�_�&�:�"5�2��U��;K�!L�M�E�
�Z��E�
�6�R�<�4�'�2�-�!�3�u�<�t�C��J������E��8�8�S��Z�M��3�D��D��J��D��J��9�9�5�9�!�D��;�;�t��D��Kr0c��eZdZeZ				d									dd�Zd
d�Zd�Zd
d�Zdd�Z	dd�Z
ddd�Zd	�Ze
e�ddd
��Zy)�XportReaderNc��||_d|_||_||_t	|d|d|��|_|j
j|_	|j�y#t$r|j��wxYw)Nr�rbF)�encoding�is_text�compression)�	_encoding�_lines_read�_index�
_chunksizer	�handles�handle�filepath_or_buffer�_read_header�	Exception�close)�selfre�indexr\�	chunksizer^s      r&�__init__zXportReader.__init__s|��"���������#���!�����#�
���#'�,�,�"5�"5���	�������	��J�J�L��	�s�A�A:c�8�|jj�y�N)rcrh�ris r&rhzXportReader.closes�������r0c�T�|jjd�j�S)N�P)re�read�decoderos r&�_get_rowzXportReader._get_row s"���&�&�+�+�B�/�6�6�8�8r0c
���|jjd�|j�}|tk7rd|vrt	d��t	d��|j�}ddgddgd	dgd
dgddgg}t||�}|dd
k7rt	d��t
|d�|d<||_|j�}t
|dd�|d<|j�}|j�}|jt�}|tk(}	|r|	st	d��t|dd�}
ddgddgddgddgd	dgd
dgddgg}t|j�|�}ddgd
dgddgddgg}|jt|j�|��t
|d�|d<t
|d�|d<||_
ddd�}
t|j�dd�}|
|z}|dzr|d|dzz
z
}|jj|�}g}d}t|�|
k\r�|d|
||
d}}|j!d�}t#j$d|�}t't)t*|��}|d
=|
|d |d <|d!}|d dk(r|d"ks|dkDrd#|�d$�}t-|��|j/�D]\}}	|j1�||<�||d!z
}||gz
}t|�|
k\r��|j�}|t4k(st	d%��||_||_|jj;�|_|j?�|_ |j6D�cgc]}|d&jC���c}|_"tG|j6�D��cgc]$\}}d'tI|�zd(tI|d!�zf��&}}}tKjL|�}||_'y#t2$rY��<wxYwcc}wcc}}w))Nrz**COMPRESSED**z<Header record indicates a CPORT file, which is not readable.z#Header record is not an XPORT file.�prefixrH�versionr2�OSr�created�zSAS     SAS     SASLIBz!Header record has invalid prefix.�modifiedzMember header not found�������set_name�sasdatar�(�type�numeric�char)rCrD�6�:rq�z>hhhh8s40s8shhh2s8shhl52srrrDzFloating field width z is not between 2 and 8.zObservation header not found.rr*r4)(re�seekrt�_correct_line1r"r/r'�	file_info�
startswith�_correct_header1�_correct_header2�int�update�member_inforrr:�ljust�struct�unpack�dict�zip�
_fieldkeys�	TypeError�itemsr)�AttributeError�_correct_obs_header�fields�
record_length�tell�record_start�
_record_count�nobsrs�columns�	enumerate�strr8r6�_dtype)ri�line1�line2�fifr��line3�header1�header2�	headflag1�	headflag2�fieldnamelength�memr��types�
fieldcount�
datalength�	fielddatar��
obs_length�
fieldbytes�fieldstruct�field�fl�msg�k�v�header�x�i�dtypelr6s                               r&rfzXportReader._read_header#s]�����$�$�Q�'��
�
����N�"��5�(�!�R����B�C�C��
�
����"�~�	�1�~��a�y�3��)�i�QS�_�U����s�+�	��X��":�:��@�A�A�*�9�Y�+?�@�	�)��"����
�
��� +�E�#�2�J� 7�	�*���-�-�/���-�-�/���&�&�'7�8�	��/�/�	��i��6�7�7��g�b��n�-���q�M�
��O�
��N�
��N�
�1�I�
�"�I�
��O�
��"�$�-�-�/�3�7���B��#�r��W�b�M�F�A�;�G�����;�t�}�}���<�=�"-�k�*�.E�"F��J��!,�[��-C�!D��I��&����&�)��������B�/�0�
�$�z�1�
���?��"�z�B��.�.�J��+�+�0�0��<�	����
��)�n��/��*�?�+��/�*�+�"�J�$�)�)�#�.�J� �-�-�(C�Z�P�K���Z��5�6�E��c�
�"�5��>�2�E�'�N��~�&�B��W�~��*��a��R�!�V�-�b�T�1I�J����n�$����
���1�� �w�w�y�E�!�H�&�
�%��/�/�J��u�g��F�7�)�n��/�:������,�,��<�=�=����'��� �3�3�8�8�:����&�&�(��	�48�K�K�@�K�q��&�	�(�(�*�K�@���
&�d�k�k�2�
�2���5��3�q�6�\�3��U�>�%:�!;�;�<�2�	�
����� ������/&�����A��
s�.O�O'�)O,�	O$�#O$c�B�|j|jxsd��S)NrC��nrows)rrrbros r&�__next__zXportReader.__next__�s���y�y�t���3�!�y�4�4r0c���|jjdd�|jj�|jz
}|dzdk7rt	j
dt
���|jdkDr4|jj|j�||jzS|jjdd�|jjd�}tj|tj��}tj|dk(�}t|�dk(rd}nd	t|�z}|jj|j�||z
|jzS)
z�
        Get number of records in file.

        This is maybe suboptimal because we have to seek to the end of
        the file.

        Side effect: returns file position to record_start.
        rrDrqzxport file may be corrupted.)�
stackleveli����r5l  @@�r2)rer�r�r��warnings�warnrr�rrr8�
frombuffer�uint64�flatnonzeror:)ri�total_records_length�last_card_bytes�	last_card�ix�tail_pads      r&r�zXportReader._record_count�s:��	
���$�$�Q��*�#�6�6�;�;�=��@Q�@Q�Q���"�$��)��M�M�.�+�-�
�
����"��#�#�(�(��):�):�;�'�4�+=�+=�=�=����$�$�S�!�,��1�1�6�6�r�:���M�M�/����C�	��^�^�I�)<�<�
=���r�7�a�<��H��3�r�7�{�H����$�$�T�%6�%6�7�$�x�/�D�4F�4F�F�Fr0c�B�|�|j}|j|��S)a
        Reads lines from Xport file and returns as dataframe

        Parameters
        ----------
        size : int, defaults to None
            Number of lines to read.  If None, reads whole file.

        Returns
        -------
        DataFrame
        r�)rbrr)ri�sizes  r&�	get_chunkzXportReader.get_chunk�s#���<��?�?�D��y�y�t�y�$�$r0c��|jd��}|ddk(|ddk(z|ddk(z}|ddk\|dd	kz|dd
k(z|ddk(z}||z}|S)Nzu1,u1,u2,u4r5rBr�f2�f3r7rJ�Z�_�.)r;)rir<r��miss�miss1s     r&�_missing_doublezXportReader._missing_double�s����H�H�=�H�)���$��1���4��A��.�!�D�'�Q�,�?����g��o�!�D�'�T�/�
2���w�$��
 ���w�$��
 �	�
	
��
���r0c�|�|�|j}t||j|jz
�}||jz}|dkr|j	�t
�|jj|�}tj||j|��}i}t|j�D]�\}}|dt|�z}	|j|d}
|
dk(rLt|	|j|d�}	|j!|	�}t#|	�}tj$||<nf|j|ddk(rQ|	D�
cgc]}
|
j'���}}
|j(�(|D�
cgc]}
|
j+|j(���}}
|j-|i���t/j0|�}|j2�<t/j4t7|j|j|z��|_n|j;|j2�}|xj|z
c_|Scc}
wcc}
w)Nr)r6�countr*rr�rr�)r��minr`r�rh�
StopIterationrerrr8r�r�r�r�r�r�r@r�rW�nan�rstripr_rsr�r#�	DataFramera�Index�rangerj�	set_index)rir��
read_lines�read_len�raw�data�df_data�jr�r<rr�r��y�dfs               r&rrzXportReader.read�s����=��I�I�E����	�	�D�,<�,<� <�=�
��� 2� 2�2���q�=��J�J�L����%�%�*�*�8�4���}�}�S����:�F�����d�l�l�+�D�A�q��s�S��V�|�$�C��K�K��N�7�+�E��	�!�1�#�t�{�{�1�~�n�7U�V���+�+�C�0��$�S�)���&�&��$�����Q���(�F�2�),�-��A�Q�X�X�Z���-��>�>�-�;<�=�1�a����$�.�.�1�1�A�=��N�N�A�q�6�"�,��\�\�'�
"���;�;���x�x��d�&6�&6��8H�8H�:�8U� V�W�B�H����d�k�k�*�B����J�&���	��.��>s�H4�,"H9)Nz
ISO-8859-1N�infer)
rezFilePath | ReadBuffer[bytes]r\z
str | Nonerk�
int | Noner^r�return�None)r�r�)r��pd.DataFrame)r�r�rn)r�r�r�r�)r�r�r�r�)�__name__�
__module__�__qualname__�_xport_reader_doc�__doc__rlrhrtrfr�r�r�r�r�_read_method_docrr�r0r&rYrY�s����G�
�+� $�*1�
�8���	�
��(�
�
��8�9�l�\5�$G�L%�"	����%� �%r0rY)r%r�r�r)r*r�)-r��
__future__r�collectionsrrr��typingrr��numpyr8�pandas.util._decoratorsr�pandas.util._exceptionsr�pandasr#�pandas.io.commonr	�pandas.io.sas.sasreaderr
�pandas._typingrrr
rr�r�r�r�r��_base_params_doc�_params2_doc�_format_params_doc�
_iterator_doc�
_read_sas_docr�r�r'r/r@rW�IteratorrYr�r0r&�<module>rs���#���
� ���,�4��'�.����'��
R��'��
'���
�(C��@��9��A�
�
�������
�������
�2����
���	������,�&6�r~�*�c�l�l�~r0
