{ "cells": [ { "cell_type": "markdown", "id": "a83d3451", "metadata": {}, "source": [ "# Lab Assignment 3" ] }, { "cell_type": "markdown", "id": "c4626d84", "metadata": {}, "source": [ "This lab introduces the core Python modules used in data analysis.\n", "\n", "NumPy is a module for working with multidimensional arrays. You can read about it [here](https://pythonworld.ru/numpy)\n", "\n", "Matplotlib is a package of modules for data visualization. You can read about it [here](https://pythonworld.ru/novosti-mira-python/scientific-graphics-in-python.html)\n", "\n", "Pandas is a module for data analysis that supports a tabular representation of data. You can read about it [here](https://pythonworld.ru/obrabotka-dannyx/pandas-cookbook-1-csv-reading.html)\n", "\n", "For a deeper study, see the book by J. VanderPlas, *Python Data Science Handbook*.\n", "\n" ] }, { "cell_type": "markdown", "id": "044a3a41", "metadata": {}, "source": [ "## NumPy\n", "\n", "This module was created to speed up work with large arrays. 
As an example, let us compare how fast the sum of a random sequence is computed with the built-in tools and with numpy:" ] }, { "cell_type": "code", "execution_count": 1, "id": "fb97f953", "metadata": {}, "outputs": [], "source": [ "# import the module and give it a short alias for convenience\n", "import numpy as np\n", "\n", "# import this module to generate random data\n", "import random" ] }, { "cell_type": "code", "execution_count": 2, "id": "4d629828", "metadata": {}, "outputs": [], "source": [ "# create a list of 10_000 distinct random integers from -10_000 (inclusive) to 10_000 (exclusive)\n", "arr = random.sample(range(-10_000, 10_000), k=10_000)" ] }, { "cell_type": "code", "execution_count": 3, "id": "0701ca1a", "metadata": {}, "outputs": [], "source": [ "# Measure the execution time with the %timeit magic command" ] }, { "cell_type": "code", "execution_count": 4, "id": "062a85f3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "101 µs ± 556 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n" ] } ], "source": [ "%timeit sum(arr)" ] }, { "cell_type": "code", "execution_count": 5, "id": "86298baf", "metadata": {}, "outputs": [], "source": [ "# Now convert the list to a NumPy array and sum it with np.sum()" ] }, { "cell_type": "code", "execution_count": 6, "id": "ee277123", "metadata": {}, "outputs": [], "source": [ "arr_2 = np.array(arr)" ] }, { "cell_type": "code", "execution_count": 7, "id": "e58e9290", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.18 µs ± 17.3 ns per loop (mean ± std. dev. 
of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%timeit np.sum(arr_2)" ] }, { "cell_type": "markdown", "id": "48501958", "metadata": {}, "source": [ "As you can see, the summation runs almost 20 times faster." ] }, { "cell_type": "markdown", "id": "06dd75f1", "metadata": {}, "source": [ "NumPy lets you create various kinds of matrices in a single call:" ] }, { "cell_type": "code", "execution_count": 8, "id": "9930b6c6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0.],\n", " [0., 0.]])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create a 2x2 matrix filled with zeros\n", "np.zeros((2,2))" ] }, { "cell_type": "code", "execution_count": 9, "id": "55ed68ec", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 1.],\n", " [1., 1.],\n", " [1., 1.]])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create a 3x2 matrix filled with ones\n", "np.ones((3,2))" ] }, { "cell_type": "code", "execution_count": 10, "id": "5e86bd15", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2]\n", " [3 4]\n", " [5 6]]\n", "(3, 2)\n" ] } ], "source": [ "# the shape of an array is available through the shape attribute of np.array\n", "\n", "a = np.array([[1, 2], [3, 4], [5, 6]])\n", "\n", "print(a)\n", "print(a.shape)" ] }, { "cell_type": "code", "execution_count": 11, "id": "166f7114", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n" ] } ], "source": [ "# len returns the size along the first dimension\n", "\n", "print(len(a))" ] }, { "cell_type": "markdown", "id": "885f11a7", "metadata": {}, "source": [ "### Exercise\n", "\n", "Find in the numpy documentation the functions for creating a diagonal matrix and for filling a matrix with a user-defined number.\n", "\n", "1) Create a 5x5 identity matrix\n", "\n", "2) Create a 4x4 matrix filled with threes" ] }, { "cell_type": "code", "execution_count": null, "id": "6b85793e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d8cd415d", "metadata": {}, "source": [ "Arrays can be transposed and reshaped" ] }, { "cell_type": "code", "execution_count": 12, "id": "8c9ea473", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2]\n", " [3 4]\n", " [5 6]]\n" ] } ], "source": [ "a = np.array([[1,2],[3,4],[5,6]])\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 13, "id": "058cd289", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 5 6]]\n" ] } ], "source": [ "print(a.reshape((2,3)))" ] }, { "cell_type": "code", "execution_count": 14, "id": "0c2bd57d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 3 5]\n", " [2 4 6]]\n" ] } ], "source": [ "print(a.T)" ] }, { "cell_type": "code", "execution_count": 15, "id": "8a51aa3b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4 5 6]\n" ] } ], "source": [ "# an array can also be flattened into one dimension\n", "print(a.flatten())" ] }, { "cell_type": "markdown", "id": "5f356c97", "metadata": {}, "source": [ "### Exercise\n", "Create a matrix of shape 2x3x4 (in any way you know) and transpose it. 
Try changing the order of the axes with transpose (in effect, if you picture a three-dimensional matrix as a cube, this turns the cube in space onto one of its sides)" ] }, { "cell_type": "code", "execution_count": null, "id": "66d95b4c", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "3e43e5ea", "metadata": {}, "source": [ "Arrays can be concatenated and given new axes" ] }, { "cell_type": "code", "execution_count": 16, "id": "fff1f7a1", "metadata": {}, "outputs": [], "source": [ "a = np.array([[1,2],[3,4]])\n", "b = np.array([[5,6]])" ] }, { "cell_type": "code", "execution_count": 17, "id": "7ae37c0a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2]\n", " [3 4]\n", " [5 6]]\n" ] } ], "source": [ "print(np.concatenate((a,b)))" ] }, { "cell_type": "code", "execution_count": 18, "id": "4aafd0ee", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "old matrix\n", "[[1 2]\n", " [3 4]]\n", "(2, 2) \n", "\n", "new matrix\n", "[[[1]\n", " [2]]\n", "\n", " [[3]\n", " [4]]]\n", "(2, 2, 1)\n" ] } ], "source": [ "print('old matrix')\n", "print(a)\n", "print(a.shape, '\\n') # leave a blank line\n", "\n", "print('new matrix')\n", "b = a[:,:,np.newaxis]\n", "print(b)\n", "print(b.shape)" ] }, { "cell_type": "markdown", "id": "70b9a5c9", "metadata": {}, "source": [ "### Exercise\n", "Create a 3x4 matrix and duplicate it so that its shape becomes 2x3x4 (you will need newaxis and concatenate; np.stack is another option)" ] }, { "cell_type": "code", "execution_count": null, "id": "90c9b02e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "4a2fc50b", "metadata": {}, "source": [ "### Matrix operations" ] }, { "cell_type": "code", "execution_count": 19, "id": "b799f6fb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 2.]\n", " [3. 4.]] \n", "\n", "[[1. 0.]\n", " [0. 
1.]]\n" ] } ], "source": [ "# you can pass the data type that the matrix elements should be converted to, float in this case\n", "a = np.array([[1,2],[3,4]], float)\n", "# create an identity matrix (ones on the main diagonal, zeros everywhere else)\n", "b = np.eye(2,2)\n", "\n", "print(a, '\\n')\n", "print(b)" ] }, { "cell_type": "code", "execution_count": 20, "id": "9b91dd3b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2., 2.],\n", " [3., 5.]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a + b" ] }, { "cell_type": "code", "execution_count": 21, "id": "c173c930", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 2.],\n", " [3., 3.]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a - b" ] }, { "cell_type": "markdown", "id": "aab1e5d7", "metadata": {}, "source": [ "Multiplying matrices with the * operator performs element-wise multiplication" ] }, { "cell_type": "code", "execution_count": 22, "id": "a4b2adeb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 0.],\n", " [0., 4.]])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a * b" ] }, { "cell_type": "markdown", "id": "9a1aaf78", "metadata": {}, "source": [ "A matrix can be multiplied by a scalar" ] }, { "cell_type": "code", "execution_count": 23, "id": "d080e165", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[10., 20.],\n", " [30., 40.]])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a * 10" ] }, { "cell_type": "markdown", "id": "642171cc", "metadata": {}, "source": [ "and all basic arithmetic operations with scalars are applied element-wise" ] }, { "cell_type": "code", "execution_count": 24, "id": "caf040de", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0.],\n", " [1., 1.]])" ] }, "execution_count": 
24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a // 3" ] }, { "cell_type": "markdown", "id": "7ca21353", "metadata": {}, "source": [ "For matrix multiplication, use the dot() method" ] }, { "cell_type": "code", "execution_count": 25, "id": "97af84bd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 2.],\n", " [3., 4.]])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# the result equals a, because multiplying by the identity matrix leaves the original matrix unchanged\n", "a.dot(b)" ] }, { "cell_type": "markdown", "id": "faf8a0f2", "metadata": {}, "source": [ "### Exercise\n", "\n", "Create two matrices of shapes 2x3 and 3x2 and multiply them" ] }, { "cell_type": "code", "execution_count": null, "id": "c2ad3bc9", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "71bc90d2", "metadata": {}, "source": [ "The same operations that work on lists also work on arrays. They can be called either on the array, as a method, or as a function from numpy" ] }, { "cell_type": "code", "execution_count": 26, "id": "ef0317fc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "4\n" ] } ], "source": [ "a = np.array([[1,2],[3,4]])\n", "\n", "print(a.max())\n", "print(np.max(a))" ] }, { "cell_type": "markdown", "id": "448de690", "metadata": {}, "source": [ "### Exercise\n", "\n", "Using numpy (WITHOUT SET), write functions that take a given array of numbers and return:\n", "\n", "1) a list of the unique values\n", "\n", "2) a tuple of the mean, the maximum, and the minimum" ] }, { "cell_type": "code", "execution_count": 27, "id": "dff1fd72", "metadata": {}, "outputs": [], "source": [ "arr = np.array([1,2,3,4,5,6,7,8,8,8,9,2,3,4,17])\n", "\n", "def unique_values(arr):\n", "    \"\"\"\n", "    write your solution here\n", "    \"\"\"\n", "    pass\n", "\n", "def mean_max_min(arr):\n", "    \"\"\"\n", "    write your solution here\n", "    \"\"\"\n", "    pass" ] }, { 
"cell_type": "code", "execution_count": 28, "id": "01a9aee8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "None\n", "None\n" ] } ], "source": [ "# nothing needs to be changed here, just run the cell\n", "print(unique_values(arr))\n", "print(mean_max_min(arr))" ] }, { "cell_type": "markdown", "id": "faf2eea4", "metadata": {}, "source": [ "## Matplotlib" ] }, { "cell_type": "markdown", "id": "bc5e0963", "metadata": {}, "source": [ "This library is used to present data visually (plots, histograms, images, and so on)" ] }, { "cell_type": "code", "execution_count": 29, "id": "2bd5b586", "metadata": {}, "outputs": [], "source": [ "# import the pyplot module from the matplotlib package under its conventional short name plt\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "markdown", "id": "225c8dee", "metadata": {}, "source": [ "As an example, let us plot a sine wave" ] }, { "cell_type": "code", "execution_count": 30, "id": "1bf1604a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# define a function that computes the sine at a point\n", "def func(x):\n", "    return np.sin(x)\n", "\n", "# use linspace to create an array of 1000 values from -5*pi to +5*pi\n", "x = np.linspace(-np.pi * 5, np.pi * 5, 1000)\n", "\n", "# set the size and aspect ratio of the figure (try changing these values or deleting this line to see what happens)\n", "plt.figure(figsize=(10,3))\n", "# draw the graph with the plot function\n", "plt.plot(x, func(x))" ] }, { "cell_type": "code", "execution_count": 31, "id": "4ab380ee", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# the arguments are the X-axis values first (optional), then the Y-axis values\n", "\n", "# note the X-axis values, which we did not set ourselves (when X is omitted, the indices 0..N-1 are used)\n", "plt.figure(figsize=(10,3))\n", "plt.plot(func(x))" ] }, { "cell_type": "markdown", "id": "4e3a173a", "metadata": {}, "source": [ "Plots can be drawn on top of each other, and the color and line style can be changed" ] }, { "cell_type": "code", "execution_count": 32, "id": "dd0fa23a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,3))\n", "\n", "plt.plot(func(x), color='red', linestyle=':')\n", "plt.plot(func(x)*5, color='green', linestyle='--')" ] }, { "cell_type": "markdown", "id": "2e7a25d0", "metadata": {}, "source": [ "### Exercise\n", "Plot the function x^2+2x-1 and its derivative in blue and orange, respectively" ] }, { "cell_type": "code", "execution_count": null, "id": "edf87e95", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "6183ff76", "metadata": {}, "source": [ "With imshow you can view heat (elevation) maps of two-dimensional data, for example the values of a function of two variables, or images." ] }, { "cell_type": "code", "execution_count": 33, "id": "fd9bad72", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "arr = np.array([[0,1,0],[1,2,1],[0,1,0]])\n", "plt.imshow(arr)" ] }, { "cell_type": "markdown", "id": "420a9abe", "metadata": {}, "source": [ "With hist you can display the distribution of the data as a histogram.\n", "\n", "By specifying the number of bins, we control the intervals over which the bars are formed" ] }, { "cell_type": "code", "execution_count": 34, "id": "f2c4d922", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "data = np.array([10,4,26,17,8,45])\n", "\n", "plt.hist(data, bins=10)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "913b220c", "metadata": {}, "source": [ "## Pandas" ] }, { "cell_type": "markdown", "id": "af08b78f", "metadata": {}, "source": [ "Pandas lets you process and analyze data. It can be compared to Excel.\n", "This section covers only the main features (the library is very large, and its full documentation could not possibly fit here)" ] }, { "cell_type": "code", "execution_count": 35, "id": "7ebddb89", "metadata": {}, "outputs": [], "source": [ "# import the pandas library under its conventional short name pd\n", "import pandas as pd" ] }, { "cell_type": "markdown", "id": "e4199517", "metadata": {}, "source": [ "Pandas can read tabular data from files in formats such as .csv, .xml, .xlsx, .json, and others.\n", "\n", "As an example, let us download the classic dataset with information about the passengers of the Titanic.\n", "\n", "Link to the dataset: https://github.com/datasciencedojo/datasets/blob/master/titanic.csv" ] }, { "cell_type": "code", "execution_count": 36, "id": "bbf38d57", "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "Load this dataset.\n", "Since it is in .csv format, we use the read_csv() function\n", "and pass it the relative path to the file\n", "\"\"\"\n", "df = pd.read_csv('lab_3_titanic.csv')" ] }, { "cell_type": "code", "execution_count": 37, "id": "268d6ec7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "0 1 0 3 \n", "1 2 1 1 \n", "2 3 1 3 \n", "3 4 1 1 \n", "4 5 0 3 \n", "5 6 0 3 \n", "6 7 0 1 \n", "7 8 0 3 \n", "8 9 1 3 \n", "9 10 1 2 \n", "\n", " Name Sex Age SibSp \\\n", "0 Braund, Mr. Owen Harris male 22.0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "4 Allen, Mr. William Henry male 35.0 0 \n", "5 Moran, Mr. James male NaN 0 \n", "6 McCarthy, Mr. Timothy J male 54.0 0 \n", "7 Palsson, Master. Gosta Leonard male 2.0 3 \n", "8 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 0 \n", "9 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 1 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "0 0 A/5 21171 7.2500 NaN S \n", "1 0 PC 17599 71.2833 C85 C \n", "2 0 STON/O2. 3101282 7.9250 NaN S \n", "3 0 113803 53.1000 C123 S \n", "4 0 373450 8.0500 NaN S \n", "5 0 330877 8.4583 NaN Q \n", "6 0 17463 51.8625 E46 S \n", "7 1 349909 21.0750 NaN S \n", "8 2 347742 11.1333 NaN S \n", "9 0 237736 30.0708 NaN C " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# look at the first 10 records\n", "df[:10]" ] }, { "cell_type": "code", "execution_count": 38, "id": "044a069c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "0 1 0 3 \n", "1 2 1 1 \n", "2 3 1 3 \n", "3 4 1 1 \n", "4 5 0 3 \n", "5 6 0 3 \n", "6 7 0 1 \n", "7 8 0 3 \n", "8 9 1 3 \n", "9 10 1 2 \n", "\n", " Name Sex Age SibSp \\\n", "0 Braund, Mr. Owen Harris male 22.0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "4 Allen, Mr. William Henry male 35.0 0 \n", "5 Moran, Mr. James male NaN 0 \n", "6 McCarthy, Mr. Timothy J male 54.0 0 \n", "7 Palsson, Master. Gosta Leonard male 2.0 3 \n", "8 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 0 \n", "9 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 1 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "0 0 A/5 21171 7.2500 NaN S \n", "1 0 PC 17599 71.2833 C85 C \n", "2 0 STON/O2. 3101282 7.9250 NaN S \n", "3 0 113803 53.1000 C123 S \n", "4 0 373450 8.0500 NaN S \n", "5 0 330877 8.4583 NaN Q \n", "6 0 17463 51.8625 E46 S \n", "7 1 349909 21.0750 NaN S \n", "8 2 347742 11.1333 NaN S \n", "9 0 237736 30.0708 NaN C " ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# the same can be done like this\n", "df.head(10)" ] }, { "cell_type": "markdown", "id": "09d489ac", "metadata": {}, "source": [ "### Task\n", "Print the last 10 rows of the table" ] }, { "cell_type": "code", "execution_count": null, "id": "beb0df55", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "ca5614f8", "metadata": {}, "source": [ "As we can see, the table's columns represent the following features:\n", "- PassengerId: unique identifier of the passenger in this dataset\n", "- Survived: 0 - died, 1 - survived\n", "- Pclass: the passenger's ticket class\n", "- Name: the passenger's full name (as recorded in the documents)\n", "- Sex: the passenger's sex\n", "- Age: the passenger's age\n", "- SibSp: number of siblings or spouses 
aboard\n", "- Parch: number of children or parents aboard\n", "- Ticket: ticket number\n", "- Fare: passenger fare\n", "- Cabin: cabin number\n", "- Embarked: port of embarkation. C - Cherbourg, Q - Queenstown, S - Southampton" ] }, { "cell_type": "code", "execution_count": 39, "id": "14180f18", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['PassengerId',\n", " 'Survived',\n", " 'Pclass',\n", " 'Name',\n", " 'Sex',\n", " 'Age',\n", " 'SibSp',\n", " 'Parch',\n", " 'Ticket',\n", " 'Fare',\n", " 'Cabin',\n", " 'Embarked']" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The list of columns can be obtained with the following method\n", "df.columns.to_list()" ] }, { "cell_type": "markdown", "id": "0701d6db", "metadata": {}, "source": [ "Summary statistics for the data can be obtained with the describe() method.\n", "\n", "By default, they are computed only for the numeric features." ] }, { "cell_type": "code", "execution_count": 40, "id": "d0cde577", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass Age SibSp \\\n", "count 891.000000 891.000000 891.000000 714.000000 891.000000 \n", "mean 446.000000 0.383838 2.308642 29.699118 0.523008 \n", "std 257.353842 0.486592 0.836071 14.526497 1.102743 \n", "min 1.000000 0.000000 1.000000 0.420000 0.000000 \n", "25% 223.500000 0.000000 2.000000 20.125000 0.000000 \n", "50% 446.000000 0.000000 3.000000 28.000000 0.000000 \n", "75% 668.500000 1.000000 3.000000 38.000000 1.000000 \n", "max 891.000000 1.000000 3.000000 80.000000 8.000000 \n", "\n", " Parch Fare \n", "count 891.000000 891.000000 \n", "mean 0.381594 32.204208 \n", "std 0.806057 49.693429 \n", "min 0.000000 0.000000 \n", "25% 0.000000 7.910400 \n", "50% 0.000000 14.454200 \n", "75% 0.000000 31.000000 \n", "max 6.000000 512.329200 " ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "id": "e9c1c6b5", "metadata": {}, "source": [ "Data types and the number of non-null values in each column (and hence the number of missing values) can be viewed with the info() method" ] }, { "cell_type": "code", "execution_count": 41, "id": "964c9588", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 891 entries, 0 to 890\n", "Data columns (total 12 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 PassengerId 891 non-null int64 \n", " 1 Survived 891 non-null int64 \n", " 2 Pclass 891 non-null int64 \n", " 3 Name 891 non-null object \n", " 4 Sex 891 non-null object \n", " 5 Age 714 non-null float64\n", " 6 SibSp 891 non-null int64 \n", " 7 Parch 891 non-null int64 \n", " 8 Ticket 891 non-null object \n", " 9 Fare 891 non-null float64\n", " 10 Cabin 204 non-null object \n", " 11 Embarked 889 non-null object \n", "dtypes: float64(2), int64(5), object(5)\n", "memory usage: 83.7+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "id": "13a039cc", "metadata": {}, "source": [ "Built-in methods can be 
applied to individual features, which can be accessed in two equivalent ways:" ] }, { "cell_type": "code", "execution_count": 42, "id": "375462f9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3838383838383838" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.Survived.mean()" ] }, { "cell_type": "code", "execution_count": 43, "id": "a7fbc32b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3838383838383838" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Survived'].mean()" ] }, { "cell_type": "markdown", "id": "b33c70db", "metadata": {}, "source": [ "### Task\n", "Print the maximum and minimum passenger age" ] }, { "cell_type": "code", "execution_count": null, "id": "5ef929a7", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "bd46e44e", "metadata": {}, "source": [ "### Task\n", "Use the built-in value_counts method on the Pclass feature." ] }, { "cell_type": "code", "execution_count": null, "id": "b38010e0", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "858da916", "metadata": {}, "source": [ "To get a specific row, you can use positional indexing with iloc:" ] }, { "cell_type": "code", "execution_count": 44, "id": "890e3a69", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PassengerId 6\n", "Survived 0\n", "Pclass 3\n", "Name Moran, Mr. 
James\n", "Sex male\n", "Age NaN\n", "SibSp 0\n", "Parch 0\n", "Ticket 330877\n", "Fare 8.4583\n", "Cabin NaN\n", "Embarked Q\n", "Name: 5, dtype: object" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[5]" ] }, { "cell_type": "markdown", "id": "6d365deb", "metadata": {}, "source": [ "### Task\n", "Print the ticket number of the 120th passenger" ] }, { "cell_type": "code", "execution_count": null, "id": "5c8bea19", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d024a508", "metadata": {}, "source": [ "The data can be filtered by a condition:" ] }, { "cell_type": "code", "execution_count": 45, "id": "a9f9500d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "6 7 0 1 \n", "11 12 1 1 \n", "15 16 1 2 \n", "33 34 0 2 \n", "54 55 0 1 \n", ".. ... ... ... \n", "820 821 1 1 \n", "829 830 1 1 \n", "851 852 0 3 \n", "857 858 1 1 \n", "879 880 1 1 \n", "\n", " Name Sex Age SibSp \\\n", "6 McCarthy, Mr. Timothy J male 54.0 0 \n", "11 Bonnell, Miss. Elizabeth female 58.0 0 \n", "15 Hewlett, Mrs. (Mary D Kingcome) female 55.0 0 \n", "33 Wheadon, Mr. Edward H male 66.0 0 \n", "54 Ostby, Mr. Engelhart Cornelius male 65.0 0 \n", ".. ... ... ... ... \n", "820 Hays, Mrs. Charles Melville (Clara Jennings Gr... female 52.0 1 \n", "829 Stone, Mrs. George Nelson (Martha Evelyn) female 62.0 0 \n", "851 Svensson, Mr. Johan male 74.0 0 \n", "857 Daly, Mr. Peter Denis male 51.0 0 \n", "879 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "6 0 17463 51.8625 E46 S \n", "11 0 113783 26.5500 C103 S \n", "15 0 248706 16.0000 NaN S \n", "33 0 C.A. 24579 10.5000 NaN S \n", "54 1 113509 61.9792 B30 C \n", ".. ... ... ... ... ... 
\n", "820 1 12749 93.5000 B69 S \n", "829 0 113572 80.0000 B28 NaN \n", "851 0 347060 7.7750 NaN S \n", "857 0 113055 26.5500 E17 S \n", "879 1 11767 83.1583 C50 C \n", "\n", "[64 rows x 12 columns]" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# посмотрим все данные пассажиров, которые старше 50\n", "df[df.Age > 50]" ] }, { "cell_type": "markdown", "id": "fd091601", "metadata": {}, "source": [ "### Задание\n", "Выведите имена всех мужчин в виде списка" ] }, { "cell_type": "code", "execution_count": null, "id": "b52fd60b", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c2a8a119", "metadata": {}, "source": [ "Можно использовать встроенные графики для отображения данных" ] }, { "cell_type": "code", "execution_count": 46, "id": "d1066e4c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD5CAYAAADcDXXiAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAATvElEQVR4nO3df4xV533n8fcnOHFS0wS8jkcsWIVIKFt7rST1iG1kKRpKW7N1VPyPV0RuRCqv2D+83qzWUgX9Y6v+geRdyVUry14JmbRUpplFcSxQYiWxqGe7kVyTkDgl2GFNY9ZMIdDGP9rJRo5wv/vHHHZv8QxzuXOHmXn0fkmje85zn/Oc53sRnzk8995DqgpJUlves9gTkCQNn+EuSQ0y3CWpQYa7JDXIcJekBhnuktSg6+bqkOSjwH/vafoI8J+BP+3a1wOngX9TVW90x+wG7gfeAf5DVX39Sue46aabav369Vc/+85PfvITbrjhhoGPXypaqQOsZSlqpQ6wlkuOHTv2d1X14RmfrKq+f4AVwI+AXwD+K7Cra98F/Jdu+1bge8D1wAbgr4EVVxr3jjvuqPl47rnn5nX8UtFKHVXWshS1UkeVtVwCfLtmydWrXZbZAvx1Vf1vYBuwv2vfD9zTbW8Dxqvq7ap6FTgFbLrK80iS5uFqw3078MVue6SqzgF0jzd37WuBMz3HTHZtkqRrJNXn7QeSvA84C9xWVeeTvFlVq3qef6OqVid5DHi+qp7s2vcBz1TVU5eNtxPYCTAyMnLH+Pj4wEVMTU2xcuXKgY9fKlqpA6xlKWqlDrCWSzZv3nysqkZnfHK29ZrLf5hebvlGz/5JYE23vQY42W3vBnb39Ps68Mkrje2a+7RW6qiylqWolTqqrOUShrTm/hn+/5IMwGFgR7e9AzjU0749yfVJNgAbgaNXcR5J0jzN+VFIgCQ/B/wa8O96mh8GDia5H3gNuBegqk4kOQi8BFwEHqiqd4Y6a0nSFfUV7lX1f4B/d
lnbj5n+9MxM/fcAe+Y9O0nSQPyGqiQ1yHCXpAb1tSyz1B3/m7f43K6vXvPznn747mt+Tknqh1fuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoP6Cvckq5J8KckPkryc5JNJbkzybJJXusfVPf13JzmV5GSSuxZu+pKkmfR75f5HwNeq6l8AHwNeBnYBR6pqI3Ck2yfJrcB24DZgK/B4khXDnrgkaXZzhnuSDwKfAvYBVNXPqupNYBuwv+u2H7in294GjFfV21X1KnAK2DTcaUuSrqSfK/ePAH8L/HGS7yZ5IskNwEhVnQPoHm/u+q8FzvQcP9m1SZKukVTVlTsko8BfAndW1QtJ/gj4e+DBqlrV0++Nqlqd5DHg+ap6smvfBzxTVU9dNu5OYCfAyMjIHePj4wMXceH1tzj/04EPH9jtaz801PGmpqZYuXLlUMdcLNay9LRSB1jLJZs3bz5WVaMzPXddH8dPApNV9UK3/yWm19fPJ1lTVeeSrAEu9PS/pef4dcDZywetqr3AXoDR0dEaGxvrp5YZPXrgEI8c76eU4Tp939hQx5uYmGA+r8NSYi1LTyt1gLX0Y85lmar6EXAmyUe7pi3AS8BhYEfXtgM41G0fBrYnuT7JBmAjcHSos5YkXVG/l7sPAgeSvA/4IfDbTP9iOJjkfuA14F6AqjqR5CDTvwAuAg9U1TtDn7kkaVZ9hXtVvQjMtK6zZZb+e4A9g09LkjQffkNVkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUF9hXuS00mOJ3kxybe7thuTPJvkle5xdU//3UlOJTmZ5K6FmrwkaWZXc+W+uao+XlWj3f4u4EhVbQSOdPskuRXYDtwGbAUeT7JiiHOWJM1hPssy24D93fZ+4J6e9vGqeruqXgVOAZvmcR5J0lXqN9wL+EaSY0l2dm0jVXUOoHu8uWtfC5zpOXaya5MkXSOpqrk7Jf+8qs4muRl4FngQOFxVq3r6vFFVq5M8BjxfVU927fuAZ6rqqcvG3AnsBBgZGbljfHx84CIuvP4W53868OEDu33th4Y63tTUFCtXrhzqmIvFWpaeVuoAa7lk8+bNx3qWyv+J6/oZoKrOdo8XkjzN9DLL+SRrqupckjXAha77JHBLz+HrgLMzjLkX2AswOjpaY2NjfZbzbo8eOMQjx/sqZahO3zc21PEmJiaYz+uwlFjL0tNKHWAt/ZhzWSbJDUl+/tI28OvA94HDwI6u2w7gULd9GNie5PokG4CNwNFhT1ySNLt+LndHgKeTXOr/Z1X1tSTfAg4muR94DbgXoKpOJDkIvARcBB6oqncWZPaSpBnNGe5V9UPgYzO0/xjYMssxe4A9856dJGkgfkNVkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1qO9wT7IiyXeTfKXbvzHJs0le6R5X9/TdneRUkpNJ7lqIiUuSZnc1V+6fB17u2d8FHKmqjcCRbp8ktwLbgduArcDjSVYMZ7qSpH70Fe5J1gF3A0/0NG8D9nfb+4F7etrHq+rtqnoVOAVsGspsJUl96ffK/Q+B3wH+sadtpKrOAXSPN3fta4EzPf0muzZJ0jVy3VwdknwauFBVx5KM9TFmZmirGcbdCewEGBkZYWJioo+hZzbyAXjo9osDHz+o+cx5JlNTU0Mfc7FYy9LTSh1gLf2YM9yBO4HfTPIbwPuBDyZ5EjifZE1VnUuyBrjQ9Z8Ebuk5fh1w9
vJBq2ovsBdgdHS0xsbGBi7i0QOHeOR4P6UM1+n7xoY63sTEBPN5HZYSa1l6WqkDrKUfcy7LVNXuqlpXVeuZfqP0z6vqt4DDwI6u2w7gULd9GNie5PokG4CNwNGhz1ySNKv5XO4+DBxMcj/wGnAvQFWdSHIQeAm4CDxQVe/Me6aSpL5dVbhX1QQw0W3/GNgyS789wJ55zk2SNCC/oSpJDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUoDnDPcn7kxxN8r0kJ5L8ftd+Y5Jnk7zSPa7uOWZ3klNJTia5ayELkCS9Wz9X7m8Dv1JVHwM+DmxN8svALuBIVW0EjnT7JLkV2A7cBmwFHk+yYgHmLkmaxZzhXtOmut33dj8FbAP2d+37gXu67W3AeFW9XVWvAqeATcOctCTpyvpac0+yIsmLwAXg2ap6ARipqnMA3ePNXfe1wJmewye7NknSNZKq6r9zsgp4GngQ+GZVrep57o2qWp3kMeD5qnqya98HPFNVT1021k5gJ8DIyMgd4+PjAxdx4fW3OP/TgQ8f2O1rPzTU8aampli5cuVQx1ws1rL0tFIHWMslmzdvPlZVozM9d93VDFRVbyaZYHot/XySNVV1Lskapq/qYfpK/Zaew9YBZ2cYay+wF2B0dLTGxsauZir/xKMHDvHI8asqZShO3zc21PEmJiaYz+uwlFjL0tNKHWAt/ejn0zIf7q7YSfIB4FeBHwCHgR1dtx3AoW77MLA9yfVJNgAbgaNDnrck6Qr6udxdA+zvPvHyHuBgVX0lyfPAwST3A68B9wJU1YkkB4GXgIvAA1X1zsJMX5I0kznDvar+CvjEDO0/BrbMcsweYM+8ZydJGojfUJWkBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ2aM9yT3JLkuSQvJzmR5PNd+41Jnk3ySve4uueY3UlOJTmZ5K6FLECS9G79XLlfBB6qql8Efhl4IMmtwC7gSFVtBI50+3TPbQduA7YCjydZsRCTlyTNbM5wr6pzVfWdbvsfgJeBtcA2YH/XbT9wT7e9DRivqrer6lXgFLBpyPOWJF3BVa25J1kPfAJ4ARipqnMw/QsAuLnrthY403PYZNcmSbpGUlX9dUxWAv8D2FNVX07yZlWt6nn+japaneQx4PmqerJr3wc8U1VPXTbeTmAnwMjIyB3j4+MDF3Hh9bc4/9OBDx/Y7Ws/NNTxpqamWLly5VDHXCzWsvS0UgdYyyWbN28+VlWjMz13XT8DJHkv8BRwoKq+3DWfT7Kmqs4lWQNc6NongVt6Dl8HnL18zKraC+wFGB0drbGxsX6mMqNHDxzikeN9lTJUp+8bG+p4ExMTzOd1WEqsZelppQ6wln7082mZAPuAl6vqD3qeOgzs6LZ3AId62rcnuT7JBmAjcHR4U5YkzaWfy907gc8Cx5O82LX9LvAwcDDJ/cBrwL0AVXUiyUHgJaY/afNAVb0z7IlLkmY3Z7hX1TeBzPL0llmO2QPsmce8JEnz4DdUJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUoDn/g2zNbv2urw51vIduv8jn+hjz9MN3D/W8ktoz55V7ki8kuZDk+z1tNyZ5Nskr3ePqnud2JzmV5GSSuxZq4pKk2fWzLPMnwNbL2nYBR6pqI3Ck2yfJrcB24LbumMeTrBjabCVJfZkz3KvqL4DXL2veBuzvtvcD9/S0j1fV21X1KnAK2DScq
UqS+jXoG6ojVXUOoHu8uWtfC5zp6TfZtUmSrqFhv6GaGdpqxo7JTmAnwMjICBMTEwOfdOQD029GLnf91jGf1+pamZqaWhbz7EcrtbRSB1hLPwYN9/NJ1lTVuSRrgAtd+yRwS0+/dcDZmQaoqr3AXoDR0dEaGxsbcCrw6IFDPHJ8+X/w56HbL/ZVx+n7xhZ+MvM0MTHBfP5Ml5JWammlDrCWfgy6LHMY2NFt7wAO9bRvT3J9kg3ARuDo/KYoSbpac14mJvkiMAbclGQS+D3gYeBgkvuB14B7AarqRJKDwEvAReCBqnpngeYuSZrFnOFeVZ+Z5akts/TfA+yZz6QkSfPj7QckqUHL/11IXVP93nKh31sp9MtbLkhXxyt3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1aMHCPcnWJCeTnEqya6HOI0l6twUJ9yQrgMeAfw3cCnwmya0LcS5J0rtdt0DjbgJOVdUPAZKMA9uAlxbofNKCOf43b/G5XV+95uc9/fDd1/yci219n6/zQ7dfHOqfSYuv9UKF+1rgTM/+JPCvFuhcUpP6Dbp+XU0gthh2VzLs1/pq/MnWGxZk3FTV8AdN7gXuqqp/2+1/FthUVQ/29NkJ7Ox2PwqcnMcpbwL+bh7HLxWt1AHWshS1UgdYyyW/UFUfnumJhbpynwRu6dlfB5zt7VBVe4G9wzhZkm9X1egwxlpMrdQB1rIUtVIHWEs/FurTMt8CNibZkOR9wHbg8AKdS5J0mQW5cq+qi0n+PfB1YAXwhao6sRDnkiS920Ity1BVzwDPLNT4lxnK8s4S0EodYC1LUSt1gLXMaUHeUJUkLS5vPyBJDVrW4d7KLQ6SfCHJhSTfX+y5zFeSW5I8l+TlJCeSfH6x5zSIJO9PcjTJ97o6fn+x5zRfSVYk+W6Sryz2XOYjyekkx5O8mOTbiz2fQSVZleRLSX7Q/X355FDHX67LMt0tDv4X8GtMf/TyW8BnqmrZfQs2yaeAKeBPq+pfLvZ85iPJGmBNVX0nyc8Dx4B7ltufS5IAN1TVVJL3At8EPl9Vf7nIUxtYkv8EjAIfrKpPL/Z8BpXkNDBaVcv6c+5J9gP/s6qe6D5V+HNV9eawxl/OV+7/7xYHVfUz4NItDpadqvoL4PXFnscwVNW5qvpOt/0PwMtMf2N5WalpU93ue7uf5XklBCRZB9wNPLHYcxEk+SDwKWAfQFX9bJjBDss73Ge6xcGyC5GWJVkPfAJ4YZGnMpBuGeNF4ALwbFUtyzo6fwj8DvCPizyPYSjgG0mOdd90X44+Avwt8MfdUtkTSYZ6H4LlHO6ZoW3ZXlm1JslK4CngP1bV3y/2fAZRVe9U1ceZ/ob1piTLcsksyaeBC1V1bLHnMiR3VtUvMX3X2Qe6Zc3l5jrgl4D/VlWfAH4CDPV9w+Uc7nPe4kCLo1ujfgo4UFVfXuz5zFf3z+UJYOvizmRgdwK/2a1VjwO/kuTJxZ3S4KrqbPd4AXia6SXa5WYSmOz51+CXmA77oVnO4e4tDpag7o3IfcDLVfUHiz2fQSX5cJJV3fYHgF8FfrCokxpQVe2uqnVVtZ7pvyd/XlW/tcjTGkiSG7o36umWMX4dWHafMquqHwFnkny0a9rCkG+JvmDfUF1oLd3iIMkXgTHgpiSTwO9V1b7FndXA7gQ+Cxzv1qsBfrf7xvJysgbY330q6z3Awapa1h8hbMQI8PT0NQTXAX9WVV9b3CkN7EHgQHdx+kPgt4c5+LL9KKQkaXbLeVlGkjQLw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAb9X0aaYO0xSCCnAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "df.Parch.hist()" ] }, { "cell_type": "markdown", "id": "92e95229", "metadata": {}, "source": [ "### Задание\n", "Постройте гистограмму на которой будет отображено количество людей в каждом из классов обслуживания" ] }, { "cell_type": "code", "execution_count": null, "id": "f3b54a84", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "1ed5afe4", "metadata": {}, "source": [ "Из датафрейма можно выделять подвыборки и делать их самостоятельными датафреймами:" ] }, { "cell_type": "code", "execution_count": 47, "id": "b7c07681", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "1 2 1 1 \n", "2 3 1 3 \n", "3 4 1 1 \n", "8 9 1 3 \n", "9 10 1 2 \n", ".. ... ... ... \n", "880 881 1 2 \n", "882 883 0 3 \n", "885 886 0 3 \n", "887 888 1 1 \n", "888 889 0 3 \n", "\n", " Name Sex Age SibSp \\\n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "8 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 0 \n", "9 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 1 \n", ".. ... ... ... ... \n", "880 Shelley, Mrs. William (Imanita Parrish Hall) female 25.0 0 \n", "882 Dahlberg, Miss. Gerda Ulrika female 22.0 0 \n", "885 Rice, Mrs. William (Margaret Norton) female 39.0 0 \n", "887 Graham, Miss. Margaret Edith female 19.0 0 \n", "888 Johnston, Miss. Catherine Helen \"Carrie\" female NaN 1 \n", "\n", " Parch Ticket Fare Cabin Embarked \n", "1 0 PC 17599 71.2833 C85 C \n", "2 0 STON/O2. 3101282 7.9250 NaN S \n", "3 0 113803 53.1000 C123 S \n", "8 2 347742 11.1333 NaN S \n", "9 0 237736 30.0708 NaN C \n", ".. ... ... ... ... ... \n", "880 1 230433 26.0000 NaN S \n", "882 0 7552 10.5167 NaN S \n", "885 5 382652 29.1250 NaN Q \n", "887 0 112053 30.0000 B42 S \n", "888 2 W./C. 6607 23.4500 NaN S \n", "\n", "[314 rows x 12 columns]" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Extract all women into a separate table\n", "df_2 = df[df.Sex == 'female']\n", "df_2" ] }, { "cell_type": "code", "execution_count": 48, "id": "15eef0ed", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
\n", "
" ], "text/plain": [ " Name Age\n", "0 Braund, Mr. Owen Harris 22.0\n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... 38.0\n", "2 Heikkinen, Miss. Laina 26.0\n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) 35.0\n", "4 Allen, Mr. William Henry 35.0\n", ".. ... ...\n", "886 Montvila, Rev. Juozas 27.0\n", "887 Graham, Miss. Margaret Edith 19.0\n", "888 Johnston, Miss. Catherine Helen \"Carrie\" NaN\n", "889 Behr, Mr. Karl Howell 26.0\n", "890 Dooley, Mr. Patrick 32.0\n", "\n", "[891 rows x 2 columns]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Extract a DataFrame containing only names and ages\n", "df_3 = df[['Name', 'Age']]\n", "df_3" ] }, { "cell_type": "markdown", "id": "985f01cf", "metadata": {}, "source": [ "### Task\n", "Create a DataFrame with the Name and Survived fields for passengers without children" ] }, { "cell_type": "code", "execution_count": null, "id": "6b462754", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c73eba53", "metadata": {}, "source": [ "To transform the values of a column, you can use the .apply method, passing it a function that is applied to the value in each row. This example uses an anonymous function (a lambda function)." ] }, { "cell_type": "code", "execution_count": 49, "id": "4d58666a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
\n", "
" ], "text/plain": [ " PassengerId Survived Pclass \\\n", "0 1 0 3 \n", "1 2 1 1 \n", "2 3 1 3 \n", "3 4 1 1 \n", "4 5 0 3 \n", ".. ... ... ... \n", "886 887 0 2 \n", "887 888 1 1 \n", "888 889 0 3 \n", "889 890 1 1 \n", "890 891 0 3 \n", "\n", " Name Sex Age SibSp \\\n", "0 Braund, Mr. Owen Harris male 22.0 1 \n", "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n", "2 Heikkinen, Miss. Laina female 26.0 0 \n", "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n", "4 Allen, Mr. William Henry male 35.0 0 \n", ".. ... ... ... ... \n", "886 Montvila, Rev. Juozas male 27.0 0 \n", "887 Graham, Miss. Margaret Edith female 19.0 0 \n", "888 Johnston, Miss. Catherine Helen \"Carrie\" female NaN 1 \n", "889 Behr, Mr. Karl Howell male 26.0 0 \n", "890 Dooley, Mr. Patrick male 32.0 0 \n", "\n", " Parch Ticket Fare Cabin Embarked isAdult \n", "0 0 A/5 21171 7.2500 NaN S True \n", "1 0 PC 17599 71.2833 C85 C True \n", "2 0 STON/O2. 3101282 7.9250 NaN S True \n", "3 0 113803 53.1000 C123 S True \n", "4 0 373450 8.0500 NaN S True \n", ".. ... ... ... ... ... ... \n", "886 0 211536 13.0000 NaN S True \n", "887 0 112053 30.0000 B42 S True \n", "888 2 W./C. 6607 23.4500 NaN S False \n", "889 0 111369 30.0000 C148 C True \n", "890 0 370376 7.7500 NaN Q True \n", "\n", "[891 rows x 13 columns]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Rows with a missing Age (NaN) get False, because NaN > 18 evaluates to False\n", "df['isAdult'] = df['Age'].apply(lambda x: x > 18)\n", "df" ] }, { "cell_type": "markdown", "id": "51e5b6be", "metadata": {}, "source": [ "### Task\n", "Compute the mean ticket price on the ship. Add a new column showing whether each person's ticket was cheaper or more expensive than the mean price. 
You may fill the column with True/False values or with the string labels \"cheaper\"/\"more expensive\"" ] }, { "cell_type": "code", "execution_count": null, "id": "167386f6", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }