Java實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入實(shí)例教程(30萬條數(shù)據(jù)插入MySQL僅需13秒)

更新時(shí)間：2023年04月07日 09:23:52 作者：共飲一杯無

這篇文章主要給大家介紹了關(guān)于Java如何實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入的相關(guān)資料,30萬條數(shù)據(jù)插入MySQL僅需13秒,文中通過實(shí)例代碼介紹的非常詳細(xì),對大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,需要的朋友可以參考下

實(shí)體類、mapper和配置文件定義

User實(shí)體

/**
 * <p>用戶實(shí)體</p>
 *
 * @Author zjq
 * @Date 2021/8/3
 */
@Data
public class User {

    private int id;
    private String username;
    private int age;
}

mapper接口

public interface UserMapper {

    /**
     * 批量插入用戶
     * @param userList
     */
    void batchInsertUser(@Param("list") List<User> userList);
}

mapper.xml文件

    <!-- 批量插入用戶信息 -->
    <insert id="batchInsertUser" parameterType="java.util.List">
        insert into t_user(username,age) values
        <foreach collection="list" item="item" index="index" separator=",">
            (
            #{item.username},
            #{item.age}
            )
        </foreach>
    </insert>

jdbc.properties

jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/test
jdbc.username=root
jdbc.password=root

sqlMapConfig.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd">
<configuration>

    <!--通過properties標(biāo)簽加載外部properties文件-->
    <properties resource="jdbc.properties"></properties>

    <!--自定義別名-->
    <typeAliases>
        <typeAlias type="com.zjq.domain.User" alias="user"></typeAlias>
    </typeAliases>

    <!--數(shù)據(jù)源環(huán)境-->
    <environments default="developement">
        <environment id="developement">
            <transactionManager type="JDBC"></transactionManager>
            <dataSource type="POOLED">
                <property name="driver" value="${jdbc.driver}"/>
                <property name="url" value="${jdbc.url}"/>
                <property name="username" value="${jdbc.username}"/>
                <property name="password" value="${jdbc.password}"/>
            </dataSource>
        </environment>
    </environments>

    <!--加載映射文件-->
    <mappers>
        <mapper resource="com/zjq/mapper/UserMapper.xml"></mapper>
    </mappers>
</configuration>

不分批次直接梭哈

MyBatis直接一次性批量插入30萬條，代碼如下：

    @Test
    public void testBatchInsertUser() throws IOException {
        InputStream resourceAsStream =
                Resources.getResourceAsStream("sqlMapConfig.xml");
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
        SqlSession session = sqlSessionFactory.openSession();
        System.out.println("===== 開始插入數(shù)據(jù) =====");
        long startTime = System.currentTimeMillis();
        try {
            List<User> userList = new ArrayList<>();
            for (int i = 1; i <= 300000; i++) {
                User user = new User();
                user.setId(i);
                user.setUsername("共飲一杯無 " + i);
                user.setAge((int) (Math.random() * 100));
                userList.add(user);
            }
            session.insert("batchInsertUser", userList); // 最后插入剩余的數(shù)據(jù)
            session.commit();

            long spendTime = System.currentTimeMillis()-startTime;
            System.out.println("成功插入 30 萬條數(shù)據(jù),耗時(shí)："+spendTime+"毫秒");
        } finally {
            session.close();
        }
    }

可以看到控制臺(tái)輸出：

Cause: com.mysql.jdbc.PacketTooBigException: Packet for query is too large (27759038 >yun 4194304). You can change this value on the server by setting the max_allowed_packet’ variable.

超出最大數(shù)據(jù)包限制了，可以通過調(diào)整max_allowed_packet限制來提高可以傳輸?shù)膬?nèi)容，不過由于30萬條數(shù)據(jù)超出太多，這個(gè)不可取，梭哈看來是不行了 ??????

既然梭哈不行那我們就一條一條循環(huán)著插入行不行呢

循環(huán)逐條插入

mapper接口和mapper文件中新增單個(gè)用戶新增的內(nèi)容如下:

    /**
     * 新增單個(gè)用戶
     * @param user
     */
    void insertUser(User user);

    <!-- 新增用戶信息 -->
    <insert id="insertUser" parameterType="user">
        insert into t_user(username,age) values
            (
            #{username},
            #{age}
            )
    </insert>

調(diào)整執(zhí)行代碼如下：

    @Test
    public void testCirculateInsertUser() throws IOException {
        InputStream resourceAsStream =
                Resources.getResourceAsStream("sqlMapConfig.xml");
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
        SqlSession session = sqlSessionFactory.openSession();
        System.out.println("===== 開始插入數(shù)據(jù) =====");
        long startTime = System.currentTimeMillis();
        try {
            for (int i = 1; i <= 300000; i++) {
                User user = new User();
                user.setId(i);
                user.setUsername("共飲一杯無 " + i);
                user.setAge((int) (Math.random() * 100));
                // 一條一條新增
                session.insert("insertUser", user);
                session.commit();
            }

            long spendTime = System.currentTimeMillis()-startTime;
            System.out.println("成功插入 30 萬條數(shù)據(jù),耗時(shí)："+spendTime+"毫秒");
        } finally {
            session.close();
        }
    }

執(zhí)行后可以發(fā)現(xiàn)磁盤IO占比飆升，一直處于高位。

等啊等等啊等，好久還沒執(zhí)行完

先不管他了太慢了先搞其他的，等會(huì)再來看看結(jié)果吧。

two thousand year later …

控制臺(tái)輸出如下：

總共執(zhí)行了14909367毫秒，換算出來是4小時(shí)八分鐘。太慢了。。

??????還是優(yōu)化下之前的批處理方案吧

MyBatis實(shí)現(xiàn)插入30萬條數(shù)據(jù)

先清理表數(shù)據(jù)，然后優(yōu)化批處理執(zhí)行插入：

-- 清空用戶表
TRUNCATE table  t_user;

以下是通過 MyBatis 實(shí)現(xiàn) 30 萬條數(shù)據(jù)插入代碼實(shí)現(xiàn)：

    /**
     * 分批次批量插入
     * @throws IOException
     */
    @Test
    public void testBatchInsertUser() throws IOException {
        InputStream resourceAsStream =
                Resources.getResourceAsStream("sqlMapConfig.xml");
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
        SqlSession session = sqlSessionFactory.openSession();
        System.out.println("===== 開始插入數(shù)據(jù) =====");
        long startTime = System.currentTimeMillis();
        int waitTime = 10;
        try {
            List<User> userList = new ArrayList<>();
            for (int i = 1; i <= 300000; i++) {
                User user = new User();
                user.setId(i);
                user.setUsername("共飲一杯無 " + i);
                user.setAge((int) (Math.random() * 100));
                userList.add(user);
                if (i % 1000 == 0) {
                    session.insert("batchInsertUser", userList);
                    // 每 1000 條數(shù)據(jù)提交一次事務(wù)
                    session.commit();
                    userList.clear();

                    // 等待一段時(shí)間
                    Thread.sleep(waitTime * 1000);
                }
            }
            // 最后插入剩余的數(shù)據(jù)
            if(!CollectionUtils.isEmpty(userList)) {
                session.insert("batchInsertUser", userList);
                session.commit();
            }

            long spendTime = System.currentTimeMillis()-startTime;
            System.out.println("成功插入 30 萬條數(shù)據(jù),耗時(shí)："+spendTime+"毫秒");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            session.close();
        }
    }

使用了 MyBatis 的批處理操作，將每 1000 條數(shù)據(jù)放在一個(gè)批次中插入，能夠較為有效地提高插入速度。同時(shí)請注意在循環(huán)插入時(shí)要帶有合適的等待時(shí)間和批處理大小，以防止出現(xiàn)內(nèi)存占用過高等問題。此外，還需要在配置文件中設(shè)置合理的連接池和數(shù)據(jù)庫的參數(shù)，以獲得更好的性能。

CPU和磁盤占用情況

在上面的示例中，我們每插入1000行數(shù)據(jù)就進(jìn)行一次批處理提交，并等待10秒鐘。這有助于控制內(nèi)存占用，并確保插入操作平穩(wěn)進(jìn)行。

五十分鐘執(zhí)行完畢，時(shí)間主要用在了等待上。

如果低谷時(shí)期執(zhí)行，CPU和磁盤性能又足夠的情況下，直接批處理不等待執(zhí)行：

    /**
     * 分批次批量插入
     * @throws IOException
     */
    @Test
    public void testBatchInsertUser() throws IOException {
        InputStream resourceAsStream =
                Resources.getResourceAsStream("sqlMapConfig.xml");
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(resourceAsStream);
        SqlSession session = sqlSessionFactory.openSession();
        System.out.println("===== 開始插入數(shù)據(jù) =====");
        long startTime = System.currentTimeMillis();
        int waitTime = 10;
        try {
            List<User> userList = new ArrayList<>();
            for (int i = 1; i <= 300000; i++) {
                User user = new User();
                user.setId(i);
                user.setUsername("共飲一杯無 " + i);
                user.setAge((int) (Math.random() * 100));
                userList.add(user);
                if (i % 1000 == 0) {
                    session.insert("batchInsertUser", userList);
                    // 每 1000 條數(shù)據(jù)提交一次事務(wù)
                    session.commit();
                    userList.clear();
                }
            }
            // 最后插入剩余的數(shù)據(jù)
            if(!CollectionUtils.isEmpty(userList)) {
                session.insert("batchInsertUser", userList);
                session.commit();
            }

            long spendTime = System.currentTimeMillis()-startTime;
            System.out.println("成功插入 30 萬條數(shù)據(jù),耗時(shí)："+spendTime+"毫秒");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            session.close();
        }
    }

則24秒可以完成數(shù)據(jù)插入操作：

可以看到短時(shí)CPU和磁盤占用會(huì)飆高。

把批處理的量再調(diào)大一些調(diào)到5000，在執(zhí)行：

13秒插入成功30萬條，直接蕪湖起飛??????

JDBC實(shí)現(xiàn)插入30萬條數(shù)據(jù)

JDBC循環(huán)插入的話跟上面的mybatis逐條插入類似，不再贅述。

以下是 Java 使用 JDBC 批處理實(shí)現(xiàn) 30 萬條數(shù)據(jù)插入的示例代碼。請注意，該代碼僅提供思路，具體實(shí)現(xiàn)需根據(jù)實(shí)際情況進(jìn)行修改。

    /**
     * JDBC分批次批量插入
     * @throws IOException
     */
    @Test
    public void testJDBCBatchInsertUser() throws IOException {
        Connection connection = null;
        PreparedStatement preparedStatement = null;

        String databaseURL = "jdbc:mysql://localhost:3306/test";
        String user = "root";
        String password = "root";

        try {
            connection = DriverManager.getConnection(databaseURL, user, password);
            // 關(guān)閉自動(dòng)提交事務(wù)，改為手動(dòng)提交
            connection.setAutoCommit(false);
            System.out.println("===== 開始插入數(shù)據(jù) =====");
            long startTime = System.currentTimeMillis();
            String sqlInsert = "INSERT INTO t_user ( username, age) VALUES ( ?, ?)";
            preparedStatement = connection.prepareStatement(sqlInsert);

            Random random = new Random();
            for (int i = 1; i <= 300000; i++) {
                preparedStatement.setString(1, "共飲一杯無 " + i);
                preparedStatement.setInt(2, random.nextInt(100));
                // 添加到批處理中
                preparedStatement.addBatch();

                if (i % 1000 == 0) {
                    // 每1000條數(shù)據(jù)提交一次
                    preparedStatement.executeBatch();
                    connection.commit();
                    System.out.println("成功插入第 "+ i+" 條數(shù)據(jù)");
                }

            }
            // 處理剩余的數(shù)據(jù)
            preparedStatement.executeBatch();
            connection.commit();
            long spendTime = System.currentTimeMillis()-startTime;
            System.out.println("成功插入 30 萬條數(shù)據(jù),耗時(shí)："+spendTime+"毫秒");
        } catch (SQLException e) {
            System.out.println("Error: " + e.getMessage());
        } finally {
            if (preparedStatement != null) {
                try {
                    preparedStatement.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }

            if (connection != null) {
                try {
                    connection.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
        }
    }

上述示例代碼中，我們通過 JDBC 連接 MySQL 數(shù)據(jù)庫，并執(zhí)行批處理操作插入數(shù)據(jù)。具體實(shí)現(xiàn)步驟如下：

獲取數(shù)據(jù)庫連接。
創(chuàng)建 Statement 對象。
定義 SQL 語句，使用 PreparedStatement 對象預(yù)編譯 SQL 語句并設(shè)置參數(shù)。
執(zhí)行批處理操作。
處理剩余的數(shù)據(jù)。
關(guān)閉 Statement 和 Connection 對象。

使用setAutoCommit(false) 來禁止自動(dòng)提交事務(wù)，然后在每次批量插入之后手動(dòng)提交事務(wù)。每次插入數(shù)據(jù)時(shí)都新建一個(gè) PreparedStatement 對象以避免狀態(tài)不一致問題。在插入數(shù)據(jù)的循環(huán)中，每 10000 條數(shù)據(jù)就執(zhí)行一次 executeBatch() 插入數(shù)據(jù)。

另外，需要根據(jù)實(shí)際情況優(yōu)化連接池和數(shù)據(jù)庫的相關(guān)配置，以防止連接超時(shí)等問題。

總結(jié)

實(shí)現(xiàn)高效的大量數(shù)據(jù)插入需要結(jié)合以下優(yōu)化策略（建議綜合使用）：

批處理：批量提交SQL語句可以降低網(wǎng)絡(luò)傳輸和處理開銷，減少與數(shù)據(jù)庫交互的次數(shù)。在Java中可以使用Statement或者PreparedStatement的addBatch()方法來添加多個(gè)SQL語句，然后一次性執(zhí)行executeBatch()方法提交批處理的SQL語句。

在循環(huán)插入時(shí)帶有適當(dāng)?shù)牡却龝r(shí)間和批處理大小，從而避免內(nèi)存占用過高等問題：

設(shè)置適當(dāng)?shù)呐幚泶笮。号幚泶笮≈冈谝淮尾迦氩僮髦胁迦攵嗌傩袛?shù)據(jù)。如果批處理大小太小，插入操作的頻率將很高，而如果批處理大小太大，可能會(huì)導(dǎo)致內(nèi)存占用過高。通常，建議將批處理大小設(shè)置為1000-5000行，這將減少插入操作的頻率并降低內(nèi)存占用。
采用適當(dāng)?shù)牡却龝r(shí)間：等待時(shí)間指在批處理操作之間等待的時(shí)間量。等待時(shí)間過短可能會(huì)導(dǎo)致內(nèi)存占用過高，而等待時(shí)間過長則可能會(huì)延遲插入操作的速度。通常，建議將等待時(shí)間設(shè)置為幾秒鐘到幾十秒鐘之間，這將使操作變得平滑且避免出現(xiàn)內(nèi)存占用過高等問題。
可以考慮使用一些內(nèi)存優(yōu)化的技巧，例如使用內(nèi)存數(shù)據(jù)庫或使用游標(biāo)方式插入數(shù)據(jù)，以減少內(nèi)存占用。

總的來說，選擇適當(dāng)?shù)呐幚泶笮『偷却龝r(shí)間可以幫助您平穩(wěn)地進(jìn)行插入操作，避免出現(xiàn)內(nèi)存占用過高等問題。

索引: 在大量數(shù)據(jù)插入前暫時(shí)去掉索引，最后再打上，這樣可以大大減少寫入時(shí)候的更新索引的時(shí)間。

數(shù)據(jù)庫連接池：使用數(shù)據(jù)庫連接池可以減少數(shù)據(jù)庫連接建立和關(guān)閉的開銷，提高性能。在沒有使用數(shù)據(jù)庫連接池的情況，記得在finally中關(guān)閉相關(guān)連接。

數(shù)據(jù)庫參數(shù)調(diào)整：增加MySQL數(shù)據(jù)庫緩沖區(qū)大小、配置高性能的磁盤和I/O等。

到此這篇關(guān)于Java實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入(30萬條數(shù)據(jù)插入MySQL僅需13秒)的文章就介紹到這了,更多相關(guān)Java實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章:

亚洲乱码中文字幕综合,中国熟女仑乱hd,亚洲精品乱拍国产一区二区三区,一本大道卡一卡二卡三乱码全集资源,又粗又黄又硬又爽的免费视频

軟件下載

源碼下載

軟件編程

網(wǎng)絡(luò)編程

在線工具

數(shù)據(jù)庫

CMS

常用工具

Java實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入實(shí)例教程(30萬條數(shù)據(jù)插入MySQL僅需13秒)

目錄

實(shí)體類、mapper和配置文件定義

User實(shí)體

mapper接口

mapper.xml文件

jdbc.properties

sqlMapConfig.xml

不分批次直接梭哈

循環(huán)逐條插入

MyBatis實(shí)現(xiàn)插入30萬條數(shù)據(jù)

JDBC實(shí)現(xiàn)插入30萬條數(shù)據(jù)

總結(jié)

相關(guān)文章

最新評論

大家感興趣的內(nèi)容

最近更新的內(nèi)容

常用在線小工具

Java實(shí)現(xiàn)幾十萬條數(shù)據(jù)插入實(shí)例教程(30萬條數(shù)據(jù)插入MySQL僅需13秒)

目錄

實(shí)體類、mapper和配置文件定義

User實(shí)體

mapper接口

mapper.xml文件

jdbc.properties

sqlMapConfig.xml

不分批次直接梭哈

循環(huán)逐條插入

MyBatis實(shí)現(xiàn)插入30萬條數(shù)據(jù)

JDBC實(shí)現(xiàn)插入30萬條數(shù)據(jù)

總結(jié)

相關(guān)文章

最新評論

大家感興趣的內(nèi)容

最近更新的內(nèi)容

常用在線小工具

實(shí)體類、mapper和配置文件定義