A while ago I set out to build a simple search service for our project. Although MongoDB, our business database, provides text search, Elasticsearch (ES) is clearly better suited as a search engine when a large number of documents have to be located by keyword (most of us had previously used only the analysis and visualization features of ELK). Elasticsearch is built on Lucene, offers extremely fast queries and a rich query syntax, and occasionally doubles as a lightweight NoSQL store. Its support for complex queries and aggregations, however, is not very strong.
This article does not cover how to build a simple search service; it records several pitfalls I ran into over roughly a week of work.
Why choose Elasticsearch 5.x?
The new service has no legacy burden, so in theory it should use the latest 6.x. However, spring-data-elasticsearch only supports 5.x, and with time tight it was not practical to write our own API wrapper layer. In addition, our earlier ELK setup was in a confused state, so we had no choice but to move from 2.x to 5.x. Looking up the differences between 5.x and 2.x, the short version is: disk space -50%, indexing time -50%, query performance +25%.
Since spring-data-elasticsearch has to be upgraded to 3.0.7, Spring Boot has to be upgraded to 2.x, which directly led to the pitfalls described below.
The Docker installation of ES installs the X-Pack plugin by default
Although spring-data supports ES 5.x, its feature coverage is incomplete. If the X-Pack plugin is installed, you need to add the dependency org.elasticsearch.client:x-pack-transport:5.5.0, whose version must match the ES version, and define the TransportClient bean yourself, as follows:
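    import java.net.InetAddress;
    import java.net.UnknownHostException;

    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.settings.Settings;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;
    import org.elasticsearch.xpack.client.PreBuiltXPackTransportClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.stereotype.Component;

    @Component
    public class ESconfig {

        @Bean
        public TransportClient transportClient() throws UnknownHostException {
            // X-Pack-aware transport client; 9300 is the ES transport port
            TransportClient client = new PreBuiltXPackTransportClient(Settings.builder()
                    .put("cluster.name", "docker-cluster")
                    .put("xpack.security.user", "elastic:changeme")
                    .build())
                    .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("0.0.0.0"), 9300));
            return client;
        }
    }

I chose this quicker route because I did not want to go into the Docker container and deal with the X-Pack plugin. Unless it becomes necessary, I would rather not touch ES itself for now.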
MQ stores the message's class information, which causes deserialization to fail
RabbitMQ, although it appears in the title, has not been mentioned so far because it is used only as a plain message queue. When data changes, the ID of the changed record is published to MQ and consumed by the consumer in the search service.
The problem is that the message is wrapped in a custom object when it is published, and rabbitTemplate.receiveAndConvert then fails because the message carries that object's package (class) information. As a last resort, the consumer reads the raw message bytes from the queue and converts the JSON into an object with ObjectMapper.readValue.
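The failure can be illustrated without a broker. The following is a minimal sketch, assuming the producer used a JSON converter that records the sender's class name in a message header (Spring AMQP's Jackson2JsonMessageConverter does this with a `__TypeId__` header). The article does not say which converter was actually used, and the class name `com.example.producer.ItemChanged` is hypothetical. The sketch only shows why a class name that exists on the producer side cannot be resolved in the consumer's JVM.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeIdDemo {

    // Header name assumed here; it is the one Spring AMQP's JSON converter writes.
    static final String TYPE_ID_HEADER = "__TypeId__";

    /** Returns true if the class named in the type header can be loaded by this JVM. */
    static boolean canResolveSenderType(Map<String, String> headers) {
        String typeId = headers.get(TYPE_ID_HEADER);
        if (typeId == null) {
            return false;
        }
        try {
            Class.forName(typeId);
            return true;
        } catch (ClassNotFoundException e) {
            // The consumer does not have the producer's class (or has it in another package).
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();

        // Hypothetical producer-side class that the consumer does not ship.
        headers.put(TYPE_ID_HEADER, "com.example.producer.ItemChanged");
        System.out.println(canResolveSenderType(headers)); // prints false

        // A class every JVM has, for contrast.
        headers.put(TYPE_ID_HEADER, "java.lang.String");
        System.out.println(canResolveSenderType(headers)); // prints true
    }
}
```

Reading the body bytes and calling ObjectMapper.readValue with a consumer-side class sidesteps the header entirely, because the target type is chosen by the consumer rather than by the message.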
Configuring Gradle so that -Dloader.main can specify the startup class
Because MQ was introduced, the search service needs to start a consumer. The approach is to implement an Application that does not start the web service and to configure a SimpleMessageListenerContainer and a MessageListenerAdapter as follows:

    @Bean
    SimpleMessageListenerContainer container(ConnectionFactory connectionFactory,
                                             MessageListenerAdapter listenerAdapter,
                                             MQconfig properties) {
        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
        container.setConnectionFactory(connectionFactory);
        container.setQueueNames(properties.getQueueName());
        container.setMessageListener(listenerAdapter);
        return container;
    }

    @Bean
    MessageListenerAdapter listenerAdapter() {
        // itemConsumer is a field of this configuration class; its "consume" method handles each message
        return new MessageListenerAdapter(itemConsumer, "consume");
    }

The problem was on the Gradle side: it took me a long time to find out how to build a jar whose startup Application can be chosen with -Dloader.main. The solution is as follows:
Add the following to the xxx.gradle file:

    bootJar {
        manifest {
            attributes 'Main-Class': 'org.springframework.boot.loader.PropertiesLauncher'
        }
    }

In a Spring Boot 1.5.9 project, you need to add the following to be able to specify the startup Application:

    springBoot {
        layout = "ZIP"
    }

To check whether the setting took effect, unzip the built jar and look at xxx (project name)/META-INF/MANIFEST.MF. If it contains

    Main-Class: org.springframework.boot.loader.PropertiesLauncher

the configuration is correct. If it contains

    Main-Class: org.springframework.boot.loader.JarLauncher

the Start-Class named in the file will still be the one that is launched.
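The manifest can also be inspected without unzipping the jar. The following is a minimal sketch that uses only java.util.jar; the class name ManifestCheck and the jar path passed on the command line are hypothetical, not part of the original project.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class ManifestCheck {

    static final String PROPERTIES_LAUNCHER =
            "org.springframework.boot.loader.PropertiesLauncher";

    /** Returns true when the manifest's Main-Class is PropertiesLauncher. */
    static boolean usesPropertiesLauncher(Manifest manifest) {
        String mainClass = manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        return PROPERTIES_LAUNCHER.equals(mainClass);
    }

    /** Reads META-INF/MANIFEST.MF directly from the jar file. */
    static Manifest readManifest(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                throw new IOException("No manifest found in " + jarPath);
            }
            return manifest;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ManifestCheck <path-to-jar>");
            return;
        }
        Manifest manifest = readManifest(args[0]);
        Attributes main = manifest.getMainAttributes();
        System.out.println("Main-Class:  " + main.getValue("Main-Class"));
        System.out.println("Start-Class: " + main.getValue("Start-Class"));
        System.out.println(usesPropertiesLauncher(manifest)
                ? "PropertiesLauncher is in use; -Dloader.main can select the Application"
                : "PropertiesLauncher is not in use; Start-Class will be launched");
    }
}
```

Once the check passes, the jar would be started along the lines of java -Dloader.main=com.example.ConsumerApplication -jar app.jar, where both the class name and the jar name are placeholders.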
ES cannot modify the mapping of an existing index
Because we were using only the plain text search of ES, many search results in real use were unsatisfactory. For example, searching for "desk" failed to return content such as "computer desk", "office desk" and other kinds of desks, and there were many similar cases. We therefore added a synonym dictionary, and the fields that need tokenizing no longer use the ik_smart analyzer, so the mapping of some fields has to be changed to:

    // analyzer is the name of our own custom analyzer
    @Field(type = FieldType.Text, index = true, analyzer = "synconym")
    private String description;

Since the mapping of an existing ES index cannot be modified, the only option is to create a new mapping manually and then backfill the data with reindex (ES 5.x ships with a Reindex API). There is also an approach described online that uses aliases; in some modification scenarios it lets you change the mapping smoothly without restarting or redeploying the application. Look it up for details.
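The reindex-then-switch-alias sequence can be sketched as follows. This is only an illustration under several assumptions: the index names items_v1 and items_v2, the alias items, and the address http://localhost:9200 are hypothetical; items_v2 is assumed to exist already with the new mapping; and the application is assumed to query the alias rather than a concrete index. With X-Pack security enabled the requests would also need credentials, which are omitted here. The sketch uses the `_reindex` and `_aliases` REST endpoints through plain HttpURLConnection.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReindexSketch {

    /** Body for POST /_reindex: copy all documents from source into dest. */
    static String reindexBody(String source, String dest) {
        return "{\"source\":{\"index\":\"" + source + "\"},"
                + "\"dest\":{\"index\":\"" + dest + "\"}}";
    }

    /** Body for POST /_aliases: move the alias from oldIndex to newIndex in one request. */
    static String aliasSwapBody(String alias, String oldIndex, String newIndex) {
        return "{\"actions\":["
                + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
                + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}]}";
    }

    /** Sends a JSON body with POST and returns the HTTP status code. */
    static int post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        String reindex = reindexBody("items_v1", "items_v2");
        String swap = aliasSwapBody("items", "items_v1", "items_v2");
        if (args.length != 1) {
            // Without an ES base URL, just show the request bodies.
            System.out.println(reindex);
            System.out.println(swap);
            return;
        }
        String es = args[0]; // e.g. http://localhost:9200 (assumption)
        System.out.println("reindex -> HTTP " + post(es + "/_reindex", reindex));
        System.out.println("aliases -> HTTP " + post(es + "/_aliases", swap));
    }
}
```

Putting the remove and add actions in a single `_aliases` request means the alias never points at zero indices, which is what allows the switch to happen without touching the application.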
Those are roughly the pitfalls I hit while building a search service. Several of them took a lot of time and energy to solve, so I hope this list is useful as a reference. The search service will be optimized further, and this article will be updated gradually.